Abstract

We consider a class of infinite-state stochastic games generated by stateless 
pushdown automata (or, equivalently, 1-exit recursive state machines), where 
the winning objective is specified by a regular set of target configurations and 
a qualitative probability constraint '>0' or '=1'. The goal of one player is 
to maximize the probability of reaching the target set so that the constraint 
is satisfied, while the other player aims at the opposite. We show that the 
winner in such games can be determined in P for the '>0' constraint, and 
in NP ∩ co-NP for the '=1' constraint. Further, we prove that the winning
regions for both players are regular, and we design algorithms which compute 
the associated finite-state automata. Finally, we show that winning strategies 
can be synthesized effectively. 

1. Introduction 

Stochastic games are a formal model for discrete systems where the behavior
in each state is either controllable, adversarial, or stochastic. Formally,
a stochastic game is a directed graph $G$ with a denumerable set of vertices $V$
which is split into three disjoint subsets $V_\Box$, $V_\Diamond$, and $V_\bigcirc$.
For every $v \in V_\bigcirc$, there is a fixed probability distribution over the
outgoing edges of $v$. We also require that the set of outgoing edges of every
vertex is nonempty. The game is initiated by putting a token on some vertex. The
token is then moved from vertex to vertex by two players, □ and ◇, who choose the
next move in the vertices of $V_\Box$ and $V_\Diamond$, respectively. In the
vertices of $V_\bigcirc$, the outgoing edges are chosen according to the associated
fixed probability distribution. A quantitative winning objective is specified by
some Borel set $W$ of infinite paths in $G$ and a probability constraint
$\succ\rho$, where $\succ \in \{>, \geq\}$ is a comparison and $\rho \in [0,1]$.
An important subclass of quantitative winning objectives are qualitative winning
objectives, where the constant $\rho$ must be either 0 or 1. The goal of player □
is to maximize the probability of all runs that stay in $W$ so that it is
$\succ$-related to $\rho$, while player ◇ aims at the opposite. A strategy
specifies how a player should play. In general, a strategy may or may not depend
on the history of a play (we say that a strategy is history-dependent (H) or
memoryless (M)), and the edges may be chosen deterministically or randomly
(deterministic (D) and randomized (R) strategies). In the case of randomized
strategies, a player chooses a probability distribution on the set of outgoing
edges. Note that deterministic strategies can be seen as restricted randomized
strategies, where one of the outgoing edges has probability 1. Each pair of
strategies $(\sigma,\pi)$ for players □ and ◇ determines a play, i.e., a unique
Markov chain obtained from $G$ by applying the strategies $\sigma$ and $\pi$ in
the natural way. The outcome of a play initiated in $v$ is the probability of all
runs initiated in $v$ that are contained in the set $W$ (this probability is
denoted by $\mathcal{P}_v^{\sigma,\pi}(W)$). We say that a play is
$(\succ\rho)$-won by player □ if its outcome is $\succ$-related to $\rho$;
otherwise, the play is $(\not\succ\rho)$-won by player ◇. A strategy $\sigma$ of
player □ is $(\succ\rho)$-winning if for every strategy $\pi$ of player ◇, the
corresponding play is $(\succ\rho)$-won by player □. Similarly, a strategy $\pi$
of player ◇ is $(\not\succ\rho)$-winning if for every strategy $\sigma$ of
player □, the corresponding play is $(\not\succ\rho)$-won by player ◇. A natural
question is whether the game is determined, i.e., whether for every choice of
$\succ$ and $\rho$, either player □ has a $(\succ\rho)$-winning strategy, or
player ◇ has a $(\not\succ\rho)$-winning strategy. The answer is somewhat subtle.
A celebrated result of Martin [17] (see also [16]) implies that stochastic games
with Borel winning conditions are weakly determined, i.e., each vertex $v$ has a
value given by

$$\mathit{val}(v) \;=\; \sup_{\sigma} \inf_{\pi} \mathcal{P}_v^{\sigma,\pi}(W) \;=\; \inf_{\pi} \sup_{\sigma} \mathcal{P}_v^{\sigma,\pi}(W) \tag{1}$$

Here $\sigma$ and $\pi$ range over the sets of all strategies for player □ and
player ◇, respectively. From this we can immediately deduce the following:

• If both players have optimal strategies that guarantee the outcome
$\mathit{val}(v)$ or better against every strategy of the opponent (for example,
this holds for finite-state stochastic games and the "usual" classes of
quantitative/qualitative Borel objectives), then the game is determined
for every choice of $\succ\rho$.

• Although optimal strategies are not guaranteed to exist in general,
Equation (1) implies the existence of $\varepsilon$-optimal strategies (see
Definition 2.3) for every $\varepsilon > 0$. Hence, the game is determined for
every choice of $\succ\rho$ where $\rho \neq \mathit{val}(v)$.



The only problematic case is the situation when optimal strategies do not
exist and $\rho = \mathit{val}(v)$. The example given in Figure 1 witnesses that
such games are generally not determined, even for reachability objectives.
On the other hand, we show that finitely-branching games (such as the BPA
games considered in this paper) with reachability objectives are determined,
although an optimal strategy for player □ in a finitely-branching game does
not necessarily exist. The determinacy question for finitely-branching games
and other classes of (Borel) winning objectives is left open.

Algorithmic issues for stochastic games with quantitative/qualitative winning
objectives have been studied mainly for finite-state stochastic games.
A lot of attention has been devoted to quantitative reachability objectives,
including the special case when $\rho = \frac{1}{2}$. The problem whether
player □ has a $(\geq\frac{1}{2})$-winning strategy is known to be in
NP ∩ co-NP, but its membership in P is one of the long-standing open problems
in algorithmic game theory [8], [20]. Later, more complicated
qualitative/quantitative ω-regular winning objectives (such as Büchi, co-Büchi,
Rabin, Streett, Muller, etc.) were considered, and the complexity of the
corresponding decision problems was analyzed. We refer to [5]-[7], [9], [19],
[21] for more details. As for infinite-state stochastic games, the attention has
so far been focused on stochastic games induced by lossy channel systems and by
pushdown automata (or, equivalently, recursive state machines) [4], [11]-[14].
In the next paragraphs, we discuss the latter model in greater detail because
these results are closely related to the results presented in this paper.

A pushdown automaton (PDA) (see, e.g., [15]) is equipped with a finite
control unit and an unbounded stack. The dynamics is specified by a finite
set of rules of the form $pX \to q\alpha$, where $p, q$ are control states, $X$
is a stack symbol, and $\alpha$ is a (possibly empty) sequence of stack symbols.
A rule of the form $pX \to q\alpha$ is applicable to every configuration of the
form $pX\beta$ and produces the configuration $q\alpha\beta$. If there are
several rules with the same left-hand side, one of them must be chosen, and the
choice is made by player □, player ◇, or it is randomized. Technically, the set
of all left-hand sides (i.e., pairs of the form $pX$) is split into three
disjoint subsets $H_\Box$, $H_\Diamond$, and $H_\bigcirc$, and for all
$pX \in H_\bigcirc$ there is a fixed probability distribution over the set
of all rules of the form $pX \to q\alpha$. Thus, each PDA induces the associated
infinite-state stochastic game where the vertices are PDA configurations and
the edges are determined in the natural way. An important subclass of PDA
is obtained by restricting the number of control states to 1. Such PDA are
also known as stateless PDA or (mainly in concurrency theory) as BPA.
PDA and BPA correspond to recursive state machines (RSM) and 1-exit
RSM, respectively, in the sense that their descriptive powers are equivalent,
and there are effective linear-time translations between the corresponding
models.
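The way a rule $pX \to q\alpha$ rewrites a configuration $pX\beta$ into $q\alpha\beta$ can be illustrated by a minimal Python sketch. The names `Rule` and `successors` are hypothetical and introduced only for this illustration; the treatment of configurations with an empty stack (no successor) is an assumption of the sketch, not something fixed by the text above.

```python
from typing import List, NamedTuple, Tuple

Config = Tuple[str, Tuple[str, ...]]  # (control state, stack with the top symbol first)


class Rule(NamedTuple):
    p: str                   # control state on the left-hand side
    X: str                   # stack symbol on the left-hand side
    q: str                   # control state on the right-hand side
    alpha: Tuple[str, ...]   # symbols that replace X (possibly empty)


def successors(config: Config, rules: List[Rule]) -> List[Config]:
    """Return all configurations q.alpha.beta obtained from p.X.beta by one rule."""
    p, stack = config
    if not stack:
        # Assumption of this sketch: no rule applies to an empty stack.
        return []
    X, beta = stack[0], stack[1:]
    return [(r.q, r.alpha + beta) for r in rules if r.p == p and r.X == X]


rules = [Rule("p", "X", "q", ("Y", "X")), Rule("p", "X", "p", ())]
print(successors(("p", ("X", "Z")), rules))
```

A BPA corresponds to the special case in which all rules use the same single control state, so a configuration is determined by its stack content alone.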



In [12], the quantitative and qualitative termination objective for PDA
and BPA stochastic games is examined (a terminating run is a run which
hits a configuration with the empty stack; hence, termination is a special
form of reachability). For BPA, it is shown that the vector of optimal values
$(\mathit{val}(X), X \in \Gamma)$, where $\Gamma$ is the stack alphabet, forms
the least solution of an effectively constructible system of min-max equations.
Moreover, both players have optimal MD strategies which depend only on the
topmost stack symbol of a given configuration (such strategies are called SMD,
meaning Stackless MD). Hence, stochastic BPA games with quantitative/qualitative
termination objectives are determined. Since the least solution of the
constructed equational system can be encoded in the first-order theory of the
reals, the existence of a $(\succ\rho)$-winning strategy for player □ can be
decided in polynomial space. In the same paper [12], the
$\Sigma_2^P \cap \Pi_2^P$ upper complexity bound for the subclass of qualitative
termination objectives is established. As for PDA games, it is shown that for
every fixed $\varepsilon > 0$, the problem of distinguishing whether the optimal
value $\mathit{val}(pX)$ is equal to 1 or less than $\varepsilon$ is
undecidable. The $\Sigma_2^P \cap \Pi_2^P$ upper bound for stochastic BPA games
with qualitative termination objectives is improved to NP ∩ co-NP in [13]. In
the same paper, it is also shown that the quantitative reachability problem for
finite-state stochastic games (see above) is efficiently reducible to the
qualitative termination problem for stochastic BPA games. Hence, the NP ∩ co-NP
upper bound cannot be further improved without a major breakthrough in
algorithmic game theory. In the special case of stochastic BPA games where
$H_\Diamond = \emptyset$ or $H_\Box = \emptyset$, the qualitative termination
problem is shown to be in P (observe that if $H_\Diamond = \emptyset$ or
$H_\Box = \emptyset$, then a given BPA induces an infinite-state Markov decision
process and the goal of the only player is to maximize or minimize the
termination probability, respectively). The results for Markov decision
processes induced by BPA are generalized to (arbitrary) qualitative
reachability objectives in [4], retaining the P upper complexity bound. In the
same paper, it is also noted that the properties of reachability objectives are
quite different from those of termination (in particular, there is no apparent
way to express the vector of optimal values as a solution of some recursive
equational system, and the SMD determinacy result (see above) does not hold
either).

Our contribution: In this paper, we continue the study initiated in
[4], [11]-[14] and solve the qualitative reachability problem for unrestricted
stochastic BPA games. Thus, we obtain a substantial generalization of the
previous results.

We start by resolving the determinacy issue in Section 3. We observe that
general stochastic games with reachability objectives are not determined, and
we also show that finitely-branching stochastic games (such as BPA stochastic
games) with quantitative/qualitative reachability objectives are determined,
i.e., in every vertex, either player □ has a $(\succ\rho)$-winning strategy, or
player ◇ has a $(\not\succ\rho)$-winning strategy. This is a consequence of
several observations that are specific to reachability objectives and perhaps
interesting on their own.

The main results of our paper, presented in Sections 5, 6, and 7, concern
stochastic BPA games with qualitative reachability objectives. In the
context of BPA, a reachability objective is specified by a regular set $T$ of
target configurations. We show that the problem of determining the winner in
stochastic BPA games with qualitative reachability objectives is in P for the
'>0' constraint, and in NP ∩ co-NP for the '=1' constraint. Here we rely
on the previously discussed results about qualitative termination [13] and
use the corresponding algorithms as "black-box procedures" at appropriate
places. We also rely on observations presented in [4] which were used to solve
the simpler case with only one player. However, the full (two-player) case
brings completely new complications that need to be tackled by new methods
and ideas. Many "natural" hypotheses turned out to be incorrect (some
of the interesting cases are documented by examples later in the paper). We also
show that for each $\rho \in \{0,1\}$, the set of all configurations where
player □ (or player ◇) has a $(\succ\rho)$-winning (or
$(\not\succ\rho)$-winning) strategy is effectively regular, and the
corresponding finite-state automaton is effectively constructible
by a deterministic polynomial-time algorithm (for the '=1' constraint, the
algorithm needs an NP ∩ co-NP oracle). Finally, we also give algorithms which
compute winning strategies if they exist. These strategies are memoryless,
and they are also effectively regular in the sense that their functionality is
effectively expressible by finite-state automata (see Definition 4.3). Hence,
winning strategies in stochastic BPA games with qualitative reachability
objectives can be effectively implemented.

For the sake of readability, some of the more involved (and long) proofs
have been postponed to Section 7. In the main body of the
paper, we try to sketch the key ideas and provide some intuition behind the
presented technical constructions.

2. Basic Definitions 

In this paper, the sets of all positive integers, non-negative integers,
rational numbers, real numbers, and non-negative real numbers are denoted by
$\mathbb{N}$, $\mathbb{N}_0$, $\mathbb{Q}$, $\mathbb{R}$, and
$\mathbb{R}^{\geq 0}$, respectively. For every finite or countably infinite set
$S$, the symbol $S^*$ denotes the set of all finite words over $S$. The length
of a given word $u$ is denoted by $|u|$, and the individual letters in $u$ are
denoted by $u(0), \ldots, u(|u|-1)$. The empty word is denoted by $\varepsilon$,
and we set $|\varepsilon| = 0$. We also use $S^+$ to denote the set
$S^* \setminus \{\varepsilon\}$. For every finite or countably infinite
set $M$, a binary relation ${\to} \subseteq M \times M$ is total if for every
$m \in M$ there is some $n \in M$ such that $m \to n$. A path in
$\mathcal{M} = (M, \to)$ is a finite or infinite sequence
$w = m_0, m_1, \ldots$ such that $m_i \to m_{i+1}$ for every $i$. The length of
a finite path $w = m_0, \ldots, m_i$, denoted by $|w|$, is $i+1$. We also use
$w(i)$ to denote the element $m_i$ of $w$, and $w_i$ to denote the path
$m_i, m_{i+1}, \ldots$ (by writing $w(i) = m$ or $w_i$ we implicitly impose the
condition that $|w| \geq i+1$). A given $n \in M$ is reachable from a given
$m \in M$, written $m \to^* n$, if there is a finite path from $m$ to $n$.
A run is an infinite path. The sets of all finite paths and all runs in
$\mathcal{M}$ are denoted by $\mathit{FPath}(\mathcal{M})$ and
$\mathit{Run}(\mathcal{M})$, respectively. Similarly, the sets of all finite
paths and runs that start in a given $m \in M$ are denoted by
$\mathit{FPath}(\mathcal{M}, m)$ and $\mathit{Run}(\mathcal{M}, m)$,
respectively.

Now we recall basic notions of probability theory. Let $A$ be a finite or
countably infinite set. A probability distribution on $A$ is a function
$f : A \to \mathbb{R}^{\geq 0}$ such that $\sum_{a \in A} f(a) = 1$. A
distribution $f$ is rational if $f(a) \in \mathbb{Q}$ for every $a \in A$,
positive if $f(a) > 0$ for every $a \in A$, Dirac if $f(a) = 1$ for some
$a \in A$, and uniform if $A$ is finite and $f(a) = \frac{1}{|A|}$ for every
$a \in A$. The set of all distributions on $A$ is denoted by $\mathcal{D}(A)$.
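For a finite set $A$, the properties just defined can be checked mechanically. The following Python sketch is only an illustration under the assumption that a distribution is stored as a dictionary mapping elements of $A$ to exact `Fraction` values (so every such distribution is rational by construction); the function names are hypothetical.

```python
from fractions import Fraction
from typing import Dict, Hashable

Dist = Dict[Hashable, Fraction]


def is_distribution(f: Dist) -> bool:
    """All weights are non-negative and sum to exactly 1."""
    return all(x >= 0 for x in f.values()) and sum(f.values()) == 1


def is_positive(f: Dist) -> bool:
    """Every element of A has a strictly positive weight."""
    return is_distribution(f) and all(x > 0 for x in f.values())


def is_dirac(f: Dist) -> bool:
    """Some element of A has weight 1."""
    return is_distribution(f) and any(x == 1 for x in f.values())


def is_uniform(f: Dist) -> bool:
    """A is finite and every element has weight 1/|A|."""
    return is_distribution(f) and all(x == Fraction(1, len(f)) for x in f.values())


fair = {"a": Fraction(1, 2), "b": Fraction(1, 2)}
point = {"a": Fraction(1), "b": Fraction(0)}
print(is_uniform(fair), is_dirac(point), is_positive(point))
```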

A σ-field over a set $X$ is a set $\mathcal{F} \subseteq 2^X$ that includes $X$
and is closed under complement and countable union. A measurable space is a pair
$(X, \mathcal{F})$ where $X$ is a set called sample space and $\mathcal{F}$ is a
σ-field over $X$. A probability measure over a measurable space
$(X, \mathcal{F})$ is a function $\mathcal{P} : \mathcal{F} \to \mathbb{R}^{\geq 0}$
such that, for each countable collection $\{X_i\}_{i \in I}$ of pairwise
disjoint elements of $\mathcal{F}$,
$\mathcal{P}(\bigcup_{i \in I} X_i) = \sum_{i \in I} \mathcal{P}(X_i)$, and
moreover $\mathcal{P}(X) = 1$. A probability space is a triple
$(X, \mathcal{F}, \mathcal{P})$ where $(X, \mathcal{F})$ is a measurable space
and $\mathcal{P}$ is a probability measure over $(X, \mathcal{F})$.

Definition 2.1. A Markov chain is a triple $\mathcal{M} = (M, \to, \mathit{Prob})$
where $M$ is a finite or countably infinite set of states,
${\to} \subseteq M \times M$ is a total transition relation, and $\mathit{Prob}$
is a function which to each $s \in M$ assigns a positive probability
distribution over the set of its outgoing transitions.

In the rest of this paper, we write $s \xrightarrow{x} t$ whenever $s \to t$ and
$\mathit{Prob}((s,t)) = x$. Each $w \in \mathit{FPath}(\mathcal{M})$ determines
a basic cylinder $\mathit{Run}(\mathcal{M}, w)$ which consists of all runs that
start with $w$. To every $s \in M$ we associate the probability space
$(\mathit{Run}(\mathcal{M}, s), \mathcal{F}, \mathcal{P})$ where $\mathcal{F}$
is the σ-field generated by all basic cylinders $\mathit{Run}(\mathcal{M}, w)$
where $w$ starts with $s$, and $\mathcal{P} : \mathcal{F} \to \mathbb{R}^{\geq 0}$
is the unique probability measure such that
$\mathcal{P}(\mathit{Run}(\mathcal{M}, w)) = \prod_{i=0}^{m-1} x_i$ where
$w = s_0, \ldots, s_m$ and $s_i \xrightarrow{x_i} s_{i+1}$ for every
$0 \leq i < m$ (if $m = 0$, we put $\mathcal{P}(\mathit{Run}(\mathcal{M}, w)) = 1$).
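The probability of a basic cylinder is simply the product of the transition probabilities along the finite path. The following Python sketch illustrates this for a finite Markov chain; the dictionary-of-dictionaries encoding `prob[s][t]` and the function name `cylinder_probability` are assumptions made for the illustration.

```python
from fractions import Fraction
from typing import Dict, Sequence

Chain = Dict[str, Dict[str, Fraction]]  # prob[s][t] = x  iff  s --x--> t


def cylinder_probability(prob: Chain, w: Sequence[str]) -> Fraction:
    """Probability of the basic cylinder Run(M, w) for a finite path w."""
    if not w:
        raise ValueError("a path must contain at least one state")
    result = Fraction(1)  # a path consisting of a single state has probability 1
    for s, t in zip(w, w[1:]):
        if t not in prob[s]:
            raise ValueError(f"{s} -> {t} is not a transition, so w is not a path")
        result *= prob[s][t]
    return result


chain = {
    "s": {"s": Fraction(1, 2), "t": Fraction(1, 2)},
    "t": {"t": Fraction(1)},
}
print(cylinder_probability(chain, ["s", "s", "t"]))
```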

Definition 2.2. A stochastic game is a tuple
$G = (V, \mapsto, (V_\Box, V_\Diamond, V_\bigcirc), \mathit{Prob})$
where $V$ is a finite or countably infinite set of vertices,
${\mapsto} \subseteq V \times V$ is a total edge relation,
$(V_\Box, V_\Diamond, V_\bigcirc)$ is a partition of $V$, and $\mathit{Prob}$ is
a probability assignment which to each $v \in V_\bigcirc$ assigns a positive
probability distribution on the set of its outgoing edges. We say that $G$ is
finitely branching if for each $v \in V$ there are only finitely many $u \in V$
such that $v \mapsto u$.

A stochastic game $G$ is played by two players, □ and ◇, who select the
moves in the vertices of $V_\Box$ and $V_\Diamond$, respectively. Let
$\odot \in \{\Box, \Diamond\}$. A strategy for player $\odot$ in $G$ is a
function which to each $wv \in V^* V_\odot$ assigns a probability
distribution on the set of outgoing edges of $v$. The sets of all strategies for
player □ and player ◇ in $G$ are denoted by $\Sigma_G$ and $\Pi_G$ (or just by
$\Sigma$ and $\Pi$ if $G$ is understood), respectively. We say that a strategy
$\tau$ is memoryless (M) if $\tau(wv)$ depends just on the last vertex $v$, and
deterministic (D) if $\tau(wv)$ is a Dirac distribution for all $wv$. Strategies
that are not necessarily memoryless are called history-dependent (H), and
strategies that are not necessarily deterministic are called randomized (R).
Thus, we define the following four classes of strategies: MD, MR, HD, and HR,
where MD ⊆ HD ⊆ HR and MD ⊆ MR ⊆ HR, but MR and HD are incomparable.

Each pair of strategies $(\sigma, \pi) \in \Sigma \times \Pi$ determines a
unique play of the game $G$, which is a Markov chain $G(\sigma, \pi)$ where
$V^+$ is the set of states, and $wu \xrightarrow{x} wuu'$ iff $u \mapsto u'$ and
one of the following conditions holds:

• $u \in V_\Box$ and $\sigma(wu)$ assigns $x$ to $u \mapsto u'$, where $x > 0$;

• $u \in V_\Diamond$ and $\pi(wu)$ assigns $x$ to $u \mapsto u'$, where $x > 0$;

• $u \in V_\bigcirc$ and $u \stackrel{x}{\mapsto} u'$.

Let $T \subseteq V$ be a set of target vertices. For each pair of strategies
$(\sigma, \pi) \in \Sigma \times \Pi$ and every $v \in V$, let
$\mathcal{P}_v^{\sigma,\pi}(\mathit{Reach}(T, G))$ be the probability of all
$w \in \mathit{Run}(G(\sigma,\pi), v)$ such that $w$ visits some $u \in T$
(technically, this means that $w(i) \in V^* T$ for some $i \in \mathbb{N}_0$).
We write $\mathcal{P}_v^{\sigma,\pi}(\mathit{Reach}(T))$ instead of
$\mathcal{P}_v^{\sigma,\pi}(\mathit{Reach}(T, G))$ if $G$ is understood.

We say that a given $v \in V$ has a value in $G$ if

$$\sup_{\sigma \in \Sigma} \inf_{\pi \in \Pi} \mathcal{P}_v^{\sigma,\pi}(\mathit{Reach}(T)) \;=\; \inf_{\pi \in \Pi} \sup_{\sigma \in \Sigma} \mathcal{P}_v^{\sigma,\pi}(\mathit{Reach}(T)).$$

If $v$ has a value, then $\mathit{val}(v, G)$ denotes the value of $v$ defined
by this equality (we write just $\mathit{val}(v)$ instead of
$\mathit{val}(v, G)$ if $G$ is understood). Since the set of all runs that visit
a vertex of $T$ is obviously Borel, we can apply the powerful result of
Martin [17] (see also Theorem 3.1) and conclude that every $v \in V$ has a value.

Definition 2.3. Let $\varepsilon \geq 0$ and $v \in V$. We say that

• $\sigma \in \Sigma$ is $\varepsilon$-optimal (or $\varepsilon$-optimal
maximizing) in $v$ if
$\mathcal{P}_v^{\sigma,\pi}(\mathit{Reach}(T)) \geq \mathit{val}(v) - \varepsilon$
for all $\pi \in \Pi$;

• $\pi \in \Pi$ is $\varepsilon$-optimal (or $\varepsilon$-optimal
minimizing) in $v$ if
$\mathcal{P}_v^{\sigma,\pi}(\mathit{Reach}(T)) \leq \mathit{val}(v) + \varepsilon$
for all $\sigma \in \Sigma$.

A 0-optimal strategy is called optimal. A (quantitative) reachability objective
is a pair $(T, \succ\rho)$ where $T \subseteq V$ and $\succ\rho$ is a
probability constraint, i.e., $\succ \in \{>, \geq\}$ and $\rho \in [0,1]$. If
$\rho \in \{0,1\}$, then the objective is qualitative. We say that

• $\sigma \in \Sigma$ is $(T, \succ\rho)$-winning in $v$ if
$\mathcal{P}_v^{\sigma,\pi}(\mathit{Reach}(T)) \succ \rho$ for all $\pi \in \Pi$;

• $\pi \in \Pi$ is $(T, \not\succ\rho)$-winning in $v$ if
$\mathcal{P}_v^{\sigma,\pi}(\mathit{Reach}(T)) \not\succ \rho$ for all
$\sigma \in \Sigma$.

The $(T, \succ\rho)$-winning region of player □, denoted by
$[T]_\Box^{\succ\rho}$, is the set of all $v \in V$ such that player □ has a
$(T, \succ\rho)$-winning strategy in $v$. Similarly, the
$(T, \not\succ\rho)$-winning region of player ◇, denoted by
$[T]_\Diamond^{\not\succ\rho}$, consists of all $v \in V$ such that player ◇ has
a $(T, \not\succ\rho)$-winning strategy in $v$.

When writing probability constraints, we usually use '<1', '=1', and '=0'
instead of '$\not\geq 1$', '$\geq 1$', and '$\not> 0$', respectively.

Figure 1: A game which is not determined.

3. The Determinacy of Stochastic Games with Reachability Objectives

In this section we show that finitely-branching stochastic games with
quantitative/qualitative reachability objectives are determined in the sense
that for every quantitative reachability objective $(T, \succ\rho)$, each vertex
of the game belongs either to $[T]_\Box^{\succ\rho}$ or to
$[T]_\Diamond^{\not\succ\rho}$ (see Definition 2.3). Let us note that
this result cannot be extended to general (infinitely-branching) stochastic
games. A counterexample is given in Figure 1, where $T = \{t\}$ is the set
of target vertices. Observe that $\mathit{val}(s) = 0$, $\mathit{val}(u) = 1$,
and $\mathit{val}(v) = \frac{1}{2}$. It is easy to check that neither of the two
players has an optimal strategy in the vertices $v$, $u$, and $s$. Now suppose
that player □ has a $(T, \geq\frac{1}{2})$-winning strategy $\sigma$ in $v$.
Obviously, there is some fixed $\varepsilon > 0$ such that for every
$\pi \in \Pi$ we have that
$\mathcal{P}_u^{\sigma,\pi}(\mathit{Reach}(T)) = 1 - \varepsilon$. Further,
player ◇ has a strategy $\pi$ which is $\frac{\varepsilon}{2}$-optimal in every
vertex. Hence, $\mathcal{P}_v^{\sigma,\pi}(\mathit{Reach}(T)) < \frac{1}{2}$,
which is a contradiction. Similarly, one can show that there is no
$(T, \not\geq\frac{1}{2})$-winning strategy for player ◇ in $v$.

For the rest of this section, let us fix a game
$G = (V, \mapsto, (V_\Box, V_\Diamond, V_\bigcirc), \mathit{Prob})$ and a set of
target vertices $T$. Also, for every $n \in \mathbb{N}_0$ and every pair of
strategies $(\sigma, \pi) \in \Sigma \times \Pi$, let
$\mathcal{P}_v^{\sigma,\pi}(\mathit{Reach}_n(T))$ be the probability of all runs
$w \in \mathit{Run}(G(\sigma,\pi), v)$ such that $w$ visits some $u \in T$ in
at most $n$ transitions (clearly,
$\mathcal{P}_v^{\sigma,\pi}(\mathit{Reach}(T)) = \lim_{n \to \infty} \mathcal{P}_v^{\sigma,\pi}(\mathit{Reach}_n(T))$).

To keep this paper self-contained, we start by giving an elementary proof
of Martin's weak determinacy result (see Equation (1)) for the special case of
games with reachability objectives (observe that the game $G$ fixed above is
not required to be finite or finitely-branching).

Theorem 3.1. Every $v \in V$ has a value. Moreover, if $G$ is finitely-branching,
then there is an MD strategy $\pi \in \Pi$ which is optimal minimizing in every
vertex.

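Proof. Let $(V \to [0,1], \leq)$ be the complete lattice of all functions
$f : V \to [0,1]$ with component-wise ordering. We show that the tuple of all
values is the least fixed-point of the following (Bellman) functional
$\mathcal{V} : (V \to [0,1]) \to (V \to [0,1])$ defined by

$$\mathcal{V}(f)(v) \;=\; \begin{cases}
1 & \text{if } v \in T \\
\sup\{f(u) \mid v \mapsto u\} & \text{if } v \in V_\Box \setminus T \\
\inf\{f(u) \mid v \mapsto u\} & \text{if } v \in V_\Diamond \setminus T \\
\sum_{v \stackrel{x}{\mapsto} u} x \cdot f(u) & \text{if } v \in V_\bigcirc \setminus T
\end{cases}$$

Since $\mathcal{V}$ is monotone, by the Knaster-Tarski theorem [18] there is the
least fixed-point $\mu\mathcal{V}$ of $\mathcal{V}$. Let $A : V \to [0,1]$ be
the function defined by
$A(v) = \sup_{\sigma \in \Sigma} \inf_{\pi \in \Pi} \mathcal{P}_v^{\sigma,\pi}(\mathit{Reach}(T))$.
We prove the following:

(i) $A$ is a fixed point of $\mathcal{V}$.

(ii) For every $\varepsilon > 0$ there is $\pi \in \Pi$ such that for every
$v \in V$ we have that

$$\sup_{\sigma \in \Sigma} \mathcal{P}_v^{\sigma,\pi}(\mathit{Reach}(T)) \;\leq\; \mu\mathcal{V}(v) + \varepsilon \tag{2}$$

Observe that (i) implies
$\mu\mathcal{V}(v) \leq \sup_{\sigma \in \Sigma} \inf_{\pi \in \Pi} \mathcal{P}_v^{\sigma,\pi}(\mathit{Reach}(T))$.
Obviously,

$$\sup_{\sigma \in \Sigma} \inf_{\pi \in \Pi} \mathcal{P}_v^{\sigma,\pi}(\mathit{Reach}(T)) \;\leq\; \inf_{\pi \in \Pi} \sup_{\sigma \in \Sigma} \mathcal{P}_v^{\sigma,\pi}(\mathit{Reach}(T))$$

and due to (ii) we further have that
$\inf_{\pi \in \Pi} \sup_{\sigma \in \Sigma} \mathcal{P}_v^{\sigma,\pi}(\mathit{Reach}(T)) \leq \mu\mathcal{V}(v)$.
Hence, (i) and (ii) together imply that $\mu\mathcal{V}(v)$ is the value of $v$
for every $v \in V$. It remains to prove (i) and (ii).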

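Ad (i). Let $v \in V$. If $v \in T$, then clearly $A(v) = 1 = \mathcal{V}(A)(v)$.
If $v \notin T$, we can further distinguish three cases.

(a) $v \in V_\Box$. Then

$$\begin{aligned}
\mathcal{V}(A)(v) &= \sup\{A(u) \mid v \mapsto u\} \\
&= \sup\{\sup_{\sigma \in \Sigma} \inf_{\pi \in \Pi} \mathcal{P}_u^{\sigma,\pi}(\mathit{Reach}(T)) \mid v \mapsto u\} \\
&= \sup_{\sigma \in \Sigma} \inf_{\pi \in \Pi} \mathcal{P}_v^{\sigma,\pi}(\mathit{Reach}(T)) \\
&= A(v)
\end{aligned}$$

(b) $v \in V_\Diamond$. Let us denote by $\mathcal{D}(v)$ the set of all positive
probability distributions on the set of outgoing edges of $v$. Then

$$\begin{aligned}
\mathcal{V}(A)(v) &= \inf\{A(u) \mid v \mapsto u\} \\
&= \inf\{\sup_{\sigma \in \Sigma} \inf_{\pi \in \Pi} \mathcal{P}_u^{\sigma,\pi}(\mathit{Reach}(T)) \mid v \mapsto u\} \\
&= \inf_{\eta \in \mathcal{D}(v)} \sum_{v \mapsto u} \eta(v \mapsto u) \cdot \sup_{\sigma \in \Sigma} \inf_{\pi \in \Pi} \mathcal{P}_u^{\sigma,\pi}(\mathit{Reach}(T)) \\
&\stackrel{(*)}{=} \sup_{\sigma \in \Sigma} \inf_{\eta \in \mathcal{D}(v)} \sum_{v \mapsto u} \eta(v \mapsto u) \cdot \inf_{\pi \in \Pi} \mathcal{P}_u^{\sigma,\pi}(\mathit{Reach}(T)) \\
&= \sup_{\sigma \in \Sigma} \inf_{\pi \in \Pi} \mathcal{P}_v^{\sigma,\pi}(\mathit{Reach}(T)) \\
&= A(v)
\end{aligned}$$

In the equality (*), the '≥' direction is easy, and the '≤' direction can
be justified as follows: For every $\delta > 0$, there is a strategy
$\bar{\sigma} \in \Sigma$ such that for every $u \in V$ we have that

$$\sup_{\sigma \in \Sigma} \inf_{\pi \in \Pi} \mathcal{P}_u^{\sigma,\pi}(\mathit{Reach}(T)) \;\leq\; \inf_{\pi \in \Pi} \mathcal{P}_u^{\bar{\sigma},\pi}(\mathit{Reach}(T)) + \delta$$

This means that, for every $\eta \in \mathcal{D}(v)$,

$$\sum_{v \mapsto u} \eta(v \mapsto u) \cdot \sup_{\sigma \in \Sigma} \inf_{\pi \in \Pi} \mathcal{P}_u^{\sigma,\pi}(\mathit{Reach}(T)) \;\leq\; \sum_{v \mapsto u} \eta(v \mapsto u) \cdot \inf_{\pi \in \Pi} \mathcal{P}_u^{\bar{\sigma},\pi}(\mathit{Reach}(T)) + \delta$$

and thus

$$\inf_{\eta \in \mathcal{D}(v)} \sum_{v \mapsto u} \eta(v \mapsto u) \cdot \sup_{\sigma \in \Sigma} \inf_{\pi \in \Pi} \mathcal{P}_u^{\sigma,\pi}(\mathit{Reach}(T)) \;\leq\; \inf_{\eta \in \mathcal{D}(v)} \sum_{v \mapsto u} \eta(v \mapsto u) \cdot \inf_{\pi \in \Pi} \mathcal{P}_u^{\bar{\sigma},\pi}(\mathit{Reach}(T)) + \delta$$

which implies (*) because $\delta$ was chosen arbitrarily.

(c) $v \in V_\bigcirc$. Then

$$\begin{aligned}
\mathcal{V}(A)(v) &= \sum_{v \stackrel{x}{\mapsto} u} x \cdot \sup_{\sigma \in \Sigma} \inf_{\pi \in \Pi} \mathcal{P}_u^{\sigma,\pi}(\mathit{Reach}(T)) \\
&\stackrel{(**)}{=} \sup_{\sigma \in \Sigma} \inf_{\pi \in \Pi} \sum_{v \stackrel{x}{\mapsto} u} x \cdot \mathcal{P}_u^{\sigma,\pi}(\mathit{Reach}(T)) \\
&= \sup_{\sigma \in \Sigma} \inf_{\pi \in \Pi} \mathcal{P}_v^{\sigma,\pi}(\mathit{Reach}(T)) \\
&= A(v)
\end{aligned}$$

Note that the equality (**) can be justified similarly as (*) above.

Ad (ii). Let us fix some $\varepsilon > 0$. For every $j \in \mathbb{N}_0$, we
define a strategy $\pi_j$ as follows: For a given $wv \in V^* V_\Diamond$, we
choose (some) edge $v \mapsto u$ such that
$\mu\mathcal{V}(u) \leq \mu\mathcal{V}(v) + \frac{\varepsilon}{2^{|w|+j+1}}$ and
put $\pi_j(wv)(v \mapsto u) = 1$. Note that such an edge must exist, and if $G$
is finitely-branching, then there is even an edge $v \mapsto u$ such that
$\mu\mathcal{V}(u) = \mu\mathcal{V}(v)$ (i.e., when $G$ is finitely-branching,
we can also consider the case when $\varepsilon = 0$). In the sequel we also
write $\pi$ instead of $\pi_0$. We prove that for all $\sigma \in \Sigma$,
$v \in V$, and $i \geq 0$ we have that

$$\mathcal{P}_v^{\sigma,\pi_j}(\mathit{Reach}_i(T)) \;\leq\; \mu\mathcal{V}(v) + \sum_{k=j+1}^{j+i} \frac{\varepsilon}{2^k}$$

In particular, for $j = 0$ we get

$$\mathcal{P}_v^{\sigma,\pi}(\mathit{Reach}_i(T)) \;\leq\; \mu\mathcal{V}(v) + \sum_{k=1}^{i} \frac{\varepsilon}{2^k} \;\leq\; \mu\mathcal{V}(v) + \varepsilon$$

and hence

$$\sup_{\sigma \in \Sigma} \mathcal{P}_v^{\sigma,\pi}(\mathit{Reach}(T)) \;=\; \sup_{\sigma \in \Sigma} \lim_{i \to \infty} \mathcal{P}_v^{\sigma,\pi}(\mathit{Reach}_i(T)) \;\leq\; \mu\mathcal{V}(v) + \varepsilon$$

If $v \in T$, then
$\mathcal{P}_v^{\sigma,\pi_j}(\mathit{Reach}_i(T)) = 1 = \mu\mathcal{V}(v)$ for
all $j \in \mathbb{N}_0$. If $v \notin T$, we proceed by induction on $i$. If
$i = 0$, then
$\mathcal{P}_v^{\sigma,\pi_j}(\mathit{Reach}_0(T)) = 0 \leq \mu\mathcal{V}(v)$
for all $j \in \mathbb{N}_0$. Now assume that $i \geq 1$. For every
$\sigma \in \Sigma$, we use $\sigma_v$ to denote the strategy such that
$\sigma_v(wu) = \sigma(vwu)$ for all $wu \in V^* V_\Box$. We distinguish
three cases.

(a) $v \in V_\Box$. Then

$$\begin{aligned}
\mathcal{P}_v^{\sigma,\pi_j}(\mathit{Reach}_i(T)) &= \sum_{v \mapsto u} \sigma(v)(v \mapsto u) \cdot \mathcal{P}_u^{\sigma_v,\pi_{j+1}}(\mathit{Reach}_{i-1}(T)) \\
&\leq \sum_{v \mapsto u} \sigma(v)(v \mapsto u) \cdot \Big(\mu\mathcal{V}(u) + \sum_{k=j+2}^{j+i} \frac{\varepsilon}{2^k}\Big) \\
&\leq \mu\mathcal{V}(v) + \sum_{k=j+2}^{j+i} \frac{\varepsilon}{2^k} \\
&\leq \mu\mathcal{V}(v) + \sum_{k=j+1}^{j+i} \frac{\varepsilon}{2^k}
\end{aligned}$$

(b) $v \in V_\Diamond$. Then

$$\begin{aligned}
\mathcal{P}_v^{\sigma,\pi_j}(\mathit{Reach}_i(T)) &= \sum_{v \mapsto u} \pi_j(v)(v \mapsto u) \cdot \mathcal{P}_u^{\sigma_v,\pi_{j+1}}(\mathit{Reach}_{i-1}(T)) \\
&\leq \sum_{v \mapsto u} \pi_j(v)(v \mapsto u) \cdot \Big(\mu\mathcal{V}(u) + \sum_{k=j+2}^{j+i} \frac{\varepsilon}{2^k}\Big) \\
&= \Big(\sum_{v \mapsto u} \pi_j(v)(v \mapsto u) \cdot \mu\mathcal{V}(u)\Big) + \sum_{k=j+2}^{j+i} \frac{\varepsilon}{2^k} \\
&\leq \mu\mathcal{V}(v) + \frac{\varepsilon}{2^{j+1}} + \sum_{k=j+2}^{j+i} \frac{\varepsilon}{2^k} \\
&= \mu\mathcal{V}(v) + \sum_{k=j+1}^{j+i} \frac{\varepsilon}{2^k}
\end{aligned}$$

(c) $v \in V_\bigcirc$. Then

$$\begin{aligned}
\mathcal{P}_v^{\sigma,\pi_j}(\mathit{Reach}_i(T)) &= \sum_{v \stackrel{x}{\mapsto} u} x \cdot \mathcal{P}_u^{\sigma_v,\pi_{j+1}}(\mathit{Reach}_{i-1}(T)) \\
&\leq \sum_{v \stackrel{x}{\mapsto} u} x \cdot \Big(\mu\mathcal{V}(u) + \sum_{k=j+2}^{j+i} \frac{\varepsilon}{2^k}\Big) \\
&= \mu\mathcal{V}(v) + \sum_{k=j+2}^{j+i} \frac{\varepsilon}{2^k} \\
&\leq \mu\mathcal{V}(v) + \sum_{k=j+1}^{j+i} \frac{\varepsilon}{2^k}
\end{aligned}$$

If $G$ is finitely branching, then an optimal minimizing strategy $\pi$ is
obtained by considering $\varepsilon = 0$ in the above proof of (ii). □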
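For a finite game, the Bellman functional from the proof of Theorem 3.1 can be applied directly, and iterating it from the all-zero function approximates its least fixed point from below. The Python sketch below is only an illustration on a small hypothetical game: the encoding (`"max"` for $V_\Box$, `"min"` for $V_\Diamond$, `"rand"` for $V_\bigcirc$) and all function names are assumptions, and in general the least fixed point is reached only in the limit, although in this example the iteration stabilizes after three steps.

```python
from fractions import Fraction

# Hypothetical finite game: vertex -> (owner, edges).
# Player vertices list their successors; random vertices map successors to probabilities.
GAME = {
    "v": ("max", ["a", "b"]),
    "a": ("rand", {"t": Fraction(1, 2), "z": Fraction(1, 2)}),
    "b": ("min", ["t", "z"]),
    "z": ("rand", {"z": Fraction(1)}),   # sink from which the target is unreachable
    "t": ("rand", {"t": Fraction(1)}),   # target vertex
}
TARGET = {"t"}


def bellman(game, target, f):
    """Apply the functional once: 1 on targets, max / min / weighted sum elsewhere."""
    g = {}
    for v, (owner, edges) in game.items():
        if v in target:
            g[v] = Fraction(1)
        elif owner == "max":
            g[v] = max(f[u] for u in edges)
        elif owner == "min":
            g[v] = min(f[u] for u in edges)
        else:
            g[v] = sum(x * f[u] for u, x in edges.items())
    return g


def iterate(game, target, steps):
    """Start from the all-zero function and apply the functional `steps` times."""
    f = {v: Fraction(0) for v in game}
    for _ in range(steps):
        f = bellman(game, target, f)
    return f


def md_minimizer(game, target, f):
    """In every non-target min vertex, pick a successor with the smallest f-value."""
    return {v: min(edges, key=lambda u: f[u])
            for v, (owner, edges) in game.items()
            if owner == "min" and v not in target}


values = iterate(GAME, TARGET, 3)
print(values["v"], md_minimizer(GAME, TARGET, values))
```

The helper `md_minimizer` mirrors the case $\varepsilon = 0$ of part (ii): in a finitely-branching game the minimizing player can always pick an edge that preserves the value, which yields a memoryless deterministic strategy.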

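Lemma 3.2. If $G$ is finitely-branching, then for every $v \in V$ we have that

$$\forall \varepsilon > 0 \;\; \exists \sigma \in \Sigma \;\; \exists n \in \mathbb{N} \;\; \forall \pi \in \Pi : \; \mathcal{P}_v^{\sigma,\pi}(\mathit{Reach}_n(T)) > \mathit{val}(v) - \varepsilon$$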

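Proof. For all $v \in V$ and $i \in \mathbb{N}_0$, we use $\mathcal{V}_i(v)$ to
denote the value of $v$ in $G$ with the "reachability in at most $i$ steps"
objective. More precisely, we put $\mathcal{V}_i(v) = 1$ for all $v \in T$ and
$i \in \mathbb{N}_0$. If $v \notin T$, we define $\mathcal{V}_i(v)$ inductively
as follows: $\mathcal{V}_0(v) = 0$, and $\mathcal{V}_{i+1}(v)$ is equal either to
$\max\{\mathcal{V}_i(u) \mid v \mapsto u\}$,
$\min\{\mathcal{V}_i(u) \mid v \mapsto u\}$, or
$\sum_{v \stackrel{x}{\mapsto} u} x \cdot \mathcal{V}_i(u)$, depending on
whether $v \in V_\Box$, $v \in V_\Diamond$, or $v \in V_\bigcirc$, respectively.

A straightforward induction on $i$ reveals that

$$\mathcal{V}_i(v) \;=\; \max_{\sigma \in \Sigma} \min_{\pi \in \Pi} \mathcal{P}_v^{\sigma,\pi}(\mathit{Reach}_i(T))$$

Also observe that, for every $i \in \mathbb{N}_0$, there is a fixed HD strategy
$\sigma_i \in \Sigma$ such that for every $\pi \in \Pi$ and every $v \in V$ we
have that
$\mathcal{V}_i(v) \leq \mathcal{P}_v^{\sigma_i,\pi}(\mathit{Reach}_i(T))$.
Further, put $\mathcal{V}_\infty(v) = \lim_{i \to \infty} \mathcal{V}_i(v)$
(note that the limit exists because the sequence
$\mathcal{V}_0(v), \mathcal{V}_1(v), \ldots$ is non-decreasing and bounded). We
show that $\mathcal{V}_\infty$ is a fixed point of the functional $\mathcal{V}$
defined in the proof of Theorem 3.1. Hence,
$\mu\mathcal{V}(v) \leq \mathcal{V}_\infty(v)$ for every $v \in V$, which
implies that for every $\varepsilon > 0$ there is $n \in \mathbb{N}$ such that
for every $\pi \in \Pi$ we have that

$$\mathcal{P}_v^{\sigma_n,\pi}(\mathit{Reach}_n(T)) \;\geq\; \mathcal{V}_n(v) \;>\; \mu\mathcal{V}(v) - \varepsilon \;=\; \mathit{val}(v) - \varepsilon$$

So, it remains to prove that $\mathcal{V}(\mathcal{V}_\infty) = \mathcal{V}_\infty$.
We distinguish three cases.

(a) $v \in V_\Box$. Then

$$\mathcal{V}(\mathcal{V}_\infty)(v) = \max_{v \mapsto u} \lim_{i \to \infty} \mathcal{V}_i(u) = \lim_{i \to \infty} \max_{v \mapsto u} \mathcal{V}_i(u) = \lim_{i \to \infty} \mathcal{V}_{i+1}(v) = \mathcal{V}_\infty(v)$$

In the second equality, the '≤' direction is easy, and the '≥' direction can be
justified as follows: For every $u \in V$, the sequence
$\mathcal{V}_1(u), \mathcal{V}_2(u), \ldots$ is non-decreasing. Hence, for all
$i \in \mathbb{N}$ and $u \in V$ we have that
$\lim_{j \to \infty} \mathcal{V}_j(u) \geq \mathcal{V}_i(u)$ and thus
$\max_{v \mapsto u} \lim_{j \to \infty} \mathcal{V}_j(u) \geq \max_{v \mapsto u} \mathcal{V}_i(u)$,
which implies the '≥' direction.

(b) $v \in V_\Diamond$. Then

$$\mathcal{V}(\mathcal{V}_\infty)(v) = \min_{v \mapsto u} \lim_{i \to \infty} \mathcal{V}_i(u) = \lim_{i \to \infty} \min_{v \mapsto u} \mathcal{V}_i(u) = \lim_{i \to \infty} \mathcal{V}_{i+1}(v) = \mathcal{V}_\infty(v)$$

In the second equality, the '≥' direction is easy, and the '≤' direction
can be justified as follows: For every $\delta > 0$ there is $i \in \mathbb{N}$
such that for every $v \mapsto u$ we have that
$\lim_{j \to \infty} \mathcal{V}_j(u) - \delta \leq \mathcal{V}_i(u)$ (remember
that $G$ is finitely-branching). It follows that
$\min_{v \mapsto u} \lim_{j \to \infty} \mathcal{V}_j(u) - \delta \leq \min_{v \mapsto u} \mathcal{V}_i(u)$
and thus
$\min_{v \mapsto u} \lim_{j \to \infty} \mathcal{V}_j(u) - \delta \leq \lim_{i \to \infty} \min_{v \mapsto u} \mathcal{V}_i(u)$,
which implies the '≤' direction because $\delta$ was chosen arbitrarily.

(c) $v \in V_\bigcirc$. Then

$$\mathcal{V}(\mathcal{V}_\infty)(v) = \sum_{v \stackrel{x}{\mapsto} u} x \cdot \lim_{i \to \infty} \mathcal{V}_i(u) = \lim_{i \to \infty} \sum_{v \stackrel{x}{\mapsto} u} x \cdot \mathcal{V}_i(u) = \lim_{i \to \infty} \mathcal{V}_{i+1}(v) = \mathcal{V}_\infty(v)$$

by linearity of the limit. □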
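As a small worked illustration of the bounded-horizon values (the game used here is a hypothetical example, not one taken from the paper), consider a single stochastic vertex $v \notin T$ with two outgoing edges of probability $\frac{1}{2}$ each, one leading to a target vertex $t \in T$ and one leading back to $v$. The inductive definition then unfolds as follows:

```latex
\begin{aligned}
\mathcal{V}_0(v) &= 0, \\
\mathcal{V}_{i+1}(v) &= \tfrac{1}{2}\cdot\mathcal{V}_i(t) + \tfrac{1}{2}\cdot\mathcal{V}_i(v)
                      = \tfrac{1}{2} + \tfrac{1}{2}\,\mathcal{V}_i(v), \\
\mathcal{V}_i(v) &= 1 - 2^{-i} \quad\text{(by induction on } i\text{)}, \\
\mathcal{V}_\infty(v) &= \lim_{i\to\infty}\bigl(1 - 2^{-i}\bigr) = 1 = \mathit{val}(v).
\end{aligned}
```

No finite horizon attains the value 1 here, but for every $\varepsilon > 0$ any $n > \log_2(1/\varepsilon)$ gives $\mathcal{V}_n(v) = 1 - 2^{-n} > \mathit{val}(v) - \varepsilon$, which is the kind of bound the lemma asserts.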

Now we can state and prove the promised determinacy theorem. 

Theorem 3.3 (Determinacy). Assume that $G$ is finitely branching. Let
$(T, \succ\rho)$ be a (quantitative) reachability objective. Then
$V = [T]_\Box^{\succ\rho} \uplus [T]_\Diamond^{\not\succ\rho}$.

Proof. First, note that we may safely assume that for each $t \in T$ there is
only one outgoing edge $t \mapsto t$ (this assumption simplifies some of the
claims presented below). Let $v \in V$. If $\rho > \mathit{val}(v)$, then
$v \in [T]_\Diamond^{\not\succ\rho}$ because player ◇ has an
$\varepsilon$-optimal strategy for an arbitrarily small $\varepsilon > 0$ (see
Theorem 3.1). Similarly, if $\rho < \mathit{val}(v)$, then
$v \in [T]_\Box^{\succ\rho}$. Now assume that $\rho = \mathit{val}(v)$.
Obviously, it suffices to show that if player ◇ does not have a
$(T, \not\succ\rho)$-winning strategy in $v$, then player □ has a
$(T, \succ\rho)$-winning strategy in $v$. This means to show that

$$\forall \pi \in \Pi \;\; \exists \sigma \in \Sigma : \; \mathcal{P}_v^{\sigma,\pi}(\mathit{Reach}(T)) \succ \rho \tag{3}$$

implies

$$\exists \sigma \in \Sigma \;\; \forall \pi \in \Pi : \; \mathcal{P}_v^{\sigma,\pi}(\mathit{Reach}(T)) \succ \rho$$

If $\succ$ is $>$ or $\mathit{val}(v) = 0$, then the above implication follows
easily. Observe that

• if $\succ$ is $>$, then (3) does not hold, because player ◇ has an optimal
minimizing strategy by Theorem 3.1;

• for the constraint $\geq 0$, the statement is trivial.

Hence, it suffices to consider the case when $\succ$ is $\geq$ and
$\rho = \mathit{val}(v) > 0$. Assume that (3) holds. We say that a vertex
$u \in V$ is good if

$$\forall \pi \in \Pi \;\; \exists \sigma \in \Sigma : \; \mathcal{P}_u^{\sigma,\pi}(\mathit{Reach}(T)) \geq \mathit{val}(u) \tag{4}$$

Note that the vertex $v$ fixed above is good by (3). Further, we say that an edge
$u \mapsto u'$ of $G$ is optimal if either $u \in V_\bigcirc$, or
$u \in V_\Box \cup V_\Diamond$ and $\mathit{val}(u) = \mathit{val}(u')$.
Observe that for every $u \in V_\Box \cup V_\Diamond$ there is at least one
optimal edge $u \mapsto u'$, because $G$ is finitely branching (recall that the
tuple of all values is the least fixed-point of the functional $\mathcal{V}$
defined in the proof of Theorem 3.1). Further, note that if $u \in V_\Box$ is a
good vertex, then there is at least one optimal edge $u \mapsto u'$ where $u'$
is good (otherwise we immediately obtain a contradiction with (4); also observe
that if $u \in T$, then $u \mapsto u$ by the technical assumption above).
Similarly, if $u \in V_\Diamond$ is good, then for every optimal edge
$u \mapsto u'$ we have that $u'$ is good, and if $u \in V_\bigcirc$ is good and
$u \mapsto u'$, then $u'$ is good. Hence, we can define a game $\bar{G}$, where
the set of vertices $\bar{V}$ consists of all good vertices of $G$, and for all
$u, u' \in \bar{V}$ we have that $(u, u')$ is an edge of $\bar{G}$ iff
$u \mapsto u'$ is an optimal edge of $G$. The edge probabilities in $\bar{G}$
are the same as in $G$. The rest of the proof proceeds by proving the following
three claims:

(a) For every $u \in \bar{V}$ we have that
$\mathit{val}(u, \bar{G}) = \mathit{val}(u, G)$.

(b) There is $\bar{\sigma} \in \Sigma_{\bar{G}}$ such that for every
$\bar{\pi} \in \Pi_{\bar{G}}$ we have that
$\mathcal{P}_v^{\bar{\sigma},\bar{\pi}}(\mathit{Reach}(T, \bar{G})) \geq \mathit{val}(v, \bar{G}) = \rho$.

(c) The strategy $\bar{\sigma}$ can be modified into a strategy
$\sigma \in \Sigma_G$ such that for every $\pi \in \Pi_G$ we have that
$\mathcal{P}_v^{\sigma,\pi}(\mathit{Reach}(T, G)) \geq \rho$.
We start by proving Claim (a). Let u E V. Due to Theorem 13. H there 
is a MD strategy vr G Hq which is optimal minimizing in every vertex of G 
(particularly in u) and selects only the optimal edges. Hence, the strategy 
TT can also be used in the restricted game G and thus we obtain val{u, G) < 



15 



val{u,G). Now suppose that f a/ (m, G) < val{u,G). By applying Theorem [XT] 
to G, there is an optimal minimizing MD strategy 7f G Hq. Further, for every 
vertex t of G which is not good there is a strategy TTt G Hg such that for 
every a G Sg we have that Vt'^*{Reach{T,G)) < val{t,G) (this follows 
immediately from (jlj)). Now consider a strategy it' G Hq which for every 
play of G initiated in u behaves in the following way: 

• As long as player □ uses only the edges of $G$ that are preserved in $\bar G$, the strategy $\pi'$ behaves exactly like the strategy $\bar\pi$.

• When player □ uses an edge $r \mapsto r'$ which is not an edge in $\bar G$ for the first time, then the strategy $\pi'$ starts to behave either like the optimal minimizing strategy $\pi$ or the strategy $\pi_{r'}$, depending on whether $r'$ is good or not (observe that if $r'$ is good, then $val(r', G) < val(r, G)$).

Now it is easy to check that for every $\sigma \in \Sigma_G$ we have that $\mathcal{P}_u^{\sigma,\pi'}(Reach(T, G)) < val(u, G)$, which contradicts the assumption that $u$ is good.

Now we prove Claim (b). Due to Lemma 3.2, for every $u \in \bar V$ we can fix a strategy $\bar\sigma_u \in \Sigma_{\bar G}$ and $n_u \in \mathbb{N}$ such that for every $\bar\pi \in \Pi_{\bar G}$ we have that $\mathcal{P}_u^{\bar\sigma_u,\bar\pi}(Reach_{n_u}(T, \bar G)) > val(u, \bar G)/2$. For every $k \in \mathbb{N}_0$, let $B(k)$ be the set of all vertices $u$ reachable from $v$ in $\bar G$ via a path of length exactly $k$ which does not visit $T$. Observe that $B(k)$ is finite because $\bar G$ is finitely branching. Further, for every $i \in \mathbb{N}_0$ we define a bound $m_i \in \mathbb{N}$ inductively as follows: $m_0 = 1$, and $m_{i+1} = m_i + \max\{n_u \mid u \in B(m_i)\}$. Now we define a strategy $\bar\sigma \in \Sigma_{\bar G}$ which turns out to be $(T, \geq\varrho)$-winning in the vertex $v$ of $\bar G$. For every $w \in \bar V^* \bar V_\Box$ such that $m_i \leq |w| < m_{i+1}$ we put $\bar\sigma(w) = \bar\sigma_u(u w_2)$, where $w = w_1 u w_2$, $|w_1| = m_i - 1$ and $u \in \bar V$. Now it is easy to check that for every $i \in \mathbb{N}$ and every strategy $\bar\pi \in \Pi_{\bar G}$ we have that $\mathcal{P}_v^{\bar\sigma,\bar\pi}(Reach_{m_i}(T, \bar G)) \geq (1 - \frac{1}{2^i})\varrho$. This means that the strategy $\bar\sigma$ is $(T, \geq\varrho)$-winning in $v$.

It remains to prove Claim (c). Consider a strategy $\sigma \in \Sigma_G$ which for every play of $G$ initiated in $v$ behaves as follows:

• As long as player ◇ uses only the optimal edges, the strategy $\sigma$ behaves exactly like the strategy $\bar\sigma$.

• When player ◇ uses a non-optimal edge $r \mapsto r'$ for the first time, the strategy $\sigma$ starts to behave like an $\varepsilon$-optimal maximizing strategy in $r'$, where $\varepsilon = (val(r', G) - val(r, G))/2$. Note that since $r \mapsto r'$ is not optimal, we have that $val(r', G) > val(r, G)$.

It is easy to check that $\sigma$ is $(T, \geq\varrho)$-winning in $v$. □

4. Stochastic BPA Games 

Stochastic BPA games correspond to stochastic games induced by stateless pushdown automata or 1-exit recursive state machines (see Section 1). A formal definition follows.

Definition 4.1. A stochastic BPA game is a tuple $\Delta = (\Gamma, \hookrightarrow, (\Gamma_\Box, \Gamma_\Diamond, \Gamma_\bigcirc), Prob)$ where $\Gamma$ is a finite stack alphabet, $\hookrightarrow \subseteq \Gamma \times \Gamma^{\leq 2}$ is a finite set of rules (where $\Gamma^{\leq 2} = \{w \in \Gamma^* : |w| \leq 2\}$) such that for each $X \in \Gamma$ there is some rule $X \hookrightarrow \alpha$, $(\Gamma_\Box, \Gamma_\Diamond, \Gamma_\bigcirc)$ is a partition of $\Gamma$, and $Prob$ is a probability assignment which to each $X \in \Gamma_\bigcirc$ assigns a rational positive probability distribution on the set of all rules of the form $X \hookrightarrow \alpha$.

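A configuration of $\Delta$ is a word $\alpha \in \Gamma^*$, which can intuitively be interpreted as the current stack content where the leftmost symbol of $\alpha$ is on top of the stack. Each stochastic BPA game $\Delta = (\Gamma, \hookrightarrow, (\Gamma_\Box, \Gamma_\Diamond, \Gamma_\bigcirc), Prob)$ determines a unique stochastic game $G_\Delta = (\Gamma^*, \mapsto, (\Gamma_\Box\Gamma^*, \Gamma_\Diamond\Gamma^*, \Gamma_\bigcirc\Gamma^* \cup \{\varepsilon\}), Prob_\Delta)$, where the edges of $\mapsto$ are determined as follows: $\varepsilon \mapsto \varepsilon$, and $X\beta \mapsto \alpha\beta$ iff $X \hookrightarrow \alpha$. The probability assignment $Prob_\Delta$ is the natural extension of $Prob$, i.e., $\varepsilon \overset{1}{\mapsto} \varepsilon$ and for all $X \in \Gamma_\bigcirc$ we have that $X\beta \overset{x}{\mapsto} \alpha\beta$ iff $X \overset{x}{\hookrightarrow} \alpha$. The size of $\Delta$, denoted by $|\Delta|$, is the length of the corresponding binary encoding.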
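
As an illustration of how a BPA game induces edges between configurations, the following is a minimal sketch. The encoding is an assumption made here for illustration only: a configuration is a Python tuple with the top of the stack first, and `RULES` is a hypothetical rule set that is not taken from the paper.

```python
# Hypothetical rule set: each symbol maps to the list of its right-hand sides.
# Right-hand sides have length at most 2, as required by Definition 4.1.
RULES = {
    "X": [("Y", "Z"), ()],   # X -> YZ (push) and X -> eps (pop)
    "Y": [("Y",)],
    "Z": [("Z",)],
}


def successors(config, rules):
    """Return the configurations reachable from `config` by one edge of the induced game."""
    if not config:
        # The empty stack has only the self-loop eps -> eps.
        return [()]
    top, rest = config[0], config[1:]
    # X beta -> alpha beta iff X -> alpha is a rule.
    return [tuple(alpha) + rest for alpha in rules[top]]
```

For example, `successors(("X", "Z"), RULES)` rewrites only the top symbol and leaves the rest of the stack untouched.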

In this section we consider stochastic BPA games with qualitative reachability objectives $(T, \succ\varrho)$ where $T \subseteq \Gamma^*$ is a regular set of configurations. For technical convenience, we define the size of $T$ as the size of the minimal deterministic finite-state automaton $\mathscr{A}_T = (Q, q_0, \delta, F)$ which recognizes the reverse of $T$ (if we view configurations as stacks, this corresponds to the bottom-up direction). Note that the automaton $\mathscr{A}_T$ can be simulated on-the-fly in $\Delta$ by employing standard techniques (see, e.g., [13]). That is, the stack alphabet is extended to $\Gamma \times Q$ and the rules are adjusted accordingly (for example, if $X \hookrightarrow YZ$, then for every $q \in Q$ the extended BPA game has a rule $(X, q) \hookrightarrow (Y, r)(Z, q)$ where $\delta(q, Z) = r$). Note that the on-the-fly simulation of $\mathscr{A}_T$ in $\Delta$ does not affect the way the game is played, and the size of the extended game is polynomial in $|\Delta|$ and $|\mathscr{A}_T|$. The main advantage of this simulation is that the information whether a current configuration belongs to $T$ or not can now be deduced just by looking at the symbol on top of the stack. This leads to an important technical simplification in the definition of $T$.

Definition 4.2. We say that $T \subseteq \Gamma^*$ is simple if $\varepsilon \notin T$ and there is $\Gamma_T \subseteq \Gamma$ such that for every $X\alpha \in \Gamma^+$ we have that $X\alpha \in T$ iff $X \in \Gamma_T$.

Note that the requirement $\varepsilon \notin T$ in the previous definition is not truly restrictive, because each BPA can be equipped with a fresh bottom-of-the-stack symbol which cannot be removed. Hence, we can safely restrict ourselves just to simple sets of target configurations. All of the obtained results (including the complexity bounds) are valid also for regular sets of target configurations.

Since stochastic BPA games have infinitely many vertices, even memoryless strategies are not necessarily finitely representable. It turns out that the winning strategies for both players in stochastic BPA games with qualitative reachability objectives are (effectively) regular in the following sense:

Definition 4.3. Let $\Delta = (\Gamma, \hookrightarrow, (\Gamma_\Box, \Gamma_\Diamond, \Gamma_\bigcirc), Prob)$ be a stochastic BPA game, and let $\odot \in \{\Box, \Diamond\}$. We say that a strategy $\tau$ for player $\odot$ is regular if there is a deterministic finite-state automaton $\mathscr{A}$ over the alphabet $\Gamma$ such that, for every $X\alpha \in \Gamma_\odot\Gamma^*$, the value of $\tau(X\alpha)$ depends just on the control state entered by $\mathscr{A}$ after reading the reverse of $X\alpha$ (i.e., the automaton reads the stack bottom-up). Note that regular strategies are not necessarily deterministic.

A special type of regular strategies are stackless MD (SMD) strategies, where $\tau(X\alpha)$ depends just on the symbol $X$ on top of the stack. Note that SMD strategies are deterministic.

We use $T_\varepsilon$ to denote the set $T \cup \{\varepsilon\}$, and we also slightly abuse the notation by writing $\varepsilon$ instead of $\{\varepsilon\}$ (particularly in expressions such as $Reach(\varepsilon)$ or $[\varepsilon]_\Diamond^{<1}$).

In the next sections, we consider the two meaningful qualitative probability constraints $>0$ and $=1$. We show that the winning regions $[T]_\Box^{>0}$, $[T]_\Diamond^{=0}$, $[T]_\Box^{=1}$, and $[T]_\Diamond^{<1}$ are effectively regular. Further, we show that the membership to $[T]_\Box^{>0}$ and $[T]_\Diamond^{=0}$ is in P, and the membership to $[T]_\Box^{=1}$ and $[T]_\Diamond^{<1}$ is in NP ∩ co-NP. Finally, we show that the associated winning strategies are regular and effectively constructible (for both players).

5. Computing the Regions $[T]_\Box^{>0}$ and $[T]_\Diamond^{=0}$

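For the rest of this section, we fix a stochastic BPA game $\Delta = (\Gamma, \hookrightarrow, (\Gamma_\Box, \Gamma_\Diamond, \Gamma_\bigcirc), Prob)$ and a simple set $T$ of target configurations. Since we are interested only in reachability objectives, we can safely assume that for every $R \in \Gamma_T$, the only rule where $R$ appears on the left-hand side is $R \hookrightarrow R$ (this assumption simplifies the formulation of some claims).

We start by observing that the sets $[T]_\Box^{>0}$ and $[T]_\Diamond^{=0}$ are regular, and the associated finite-state automata have a fixed number of control states.

Proposition 5.1. Let $\mathscr{A} = [T]_\Box^{>0} \cap \Gamma$ and $\mathscr{B} = [T_\varepsilon]_\Box^{>0} \cap \Gamma$. Then $[T]_\Box^{>0} = \mathscr{B}^*\mathscr{A}\Gamma^*$ and $[T_\varepsilon]_\Box^{>0} = \mathscr{B}^*\mathscr{A}\Gamma^* \cup \mathscr{B}^*$. Consequently, $[T]_\Diamond^{=0} = \Gamma^* \setminus [T]_\Box^{>0} = (\mathscr{B} \setminus \mathscr{A})^* \cup (\mathscr{B} \setminus \mathscr{A})^*(\Gamma \setminus \mathscr{B})\Gamma^*$ and $[T_\varepsilon]_\Diamond^{=0} = \Gamma^* \setminus [T_\varepsilon]_\Box^{>0} = (\mathscr{B} \setminus \mathscr{A})^*(\Gamma \setminus \mathscr{B})\Gamma^*$.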

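Proof. Note that $\mathscr{A} \subseteq \mathscr{B}$. We start by introducing some notation. For every strategy $\sigma \in \Sigma$ and every $\alpha \in \Gamma^*$, let

• $\sigma[-\alpha]$ be a strategy such that for every finite sequence of configurations $\gamma_1, \ldots, \gamma_n, \gamma$, where $n \geq 0$ and $\gamma \in \Gamma_\Box\Gamma^*$, and every edge $\gamma \mapsto \delta$ we have that $\sigma[-\alpha](\gamma_1, \ldots, \gamma_n, \gamma)(\gamma \mapsto \delta) = \sigma(\gamma_1\alpha, \ldots, \gamma_n\alpha, \gamma\alpha)(\gamma\alpha \mapsto \delta\alpha)$;

• $\sigma[+\alpha]$ be a strategy such that for every finite sequence of configurations $\gamma_1\alpha, \ldots, \gamma_n\alpha, \gamma\alpha$, where $n \geq 0$ and $\gamma\alpha \in \Gamma_\Box\Gamma^*$, and every edge $\gamma\alpha \mapsto \delta\alpha$ we have that $\sigma[+\alpha](\gamma_1\alpha, \ldots, \gamma_n\alpha, \gamma\alpha)(\gamma\alpha \mapsto \delta\alpha) = \sigma(\gamma_1, \ldots, \gamma_n, \gamma)(\gamma \mapsto \delta)$.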

By induction on the length of $\alpha \in \Gamma^*$, we prove that $\alpha \in [T]_\Box^{>0}$ iff $\alpha \in \mathscr{B}^*\mathscr{A}\Gamma^*$. For $\alpha = \varepsilon$, both sides of the equivalence are false. Now assume that the equivalence holds for all configurations of length $k$ and consider an arbitrary $X\alpha \in \Gamma^+$ where $|\alpha| = k$. If $X\alpha \in [T]_\Box^{>0}$, then there are two possibilities:

• There is a strategy $\sigma \in \Sigma$ such that for all $\pi \in \Pi$, the probability of reaching $T$ without prior reaching $\alpha$ is positive in the play $G_\Delta(\sigma, \pi)$ initiated in $X\alpha$. Then $\sigma[-\alpha]$ is $(T, >0)$-winning in $X$, which means that $X \in [T]_\Box^{>0}$, i.e., $X \in \mathscr{A}$.

• There is a strategy $\sigma \in \Sigma$ such that for all $\pi \in \Pi$, the probability of reaching $T$ is positive in the play $G_\Delta(\sigma, \pi)$ initiated in $X\alpha$, but for some $\pi \in \Pi$, the configuration $\alpha$ is always reached before reaching $T$. In this case, consider again the strategy $\sigma[-\alpha]$. Then $\sigma[-\alpha]$ is $(T_\varepsilon, >0)$-winning in $X$, which means $X \in [T_\varepsilon]_\Box^{>0}$, i.e., $X \in \mathscr{B}$. Moreover, observe that the strategy $\sigma$ is $(T, >0)$-winning in $\alpha$. Thus, $\alpha \in [T]_\Box^{>0}$ and by induction hypothesis we obtain $\alpha \in \mathscr{B}^*\mathscr{A}\Gamma^*$.

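In both cases, we obtained $X\alpha \in \mathscr{B}^*\mathscr{A}\Gamma^*$. If $X\alpha \in \mathscr{B}^*\mathscr{A}\Gamma^*$, we can again distinguish two possibilities:

• $X \in \mathscr{A}$ and there is a $(T, >0)$-winning strategy $\sigma \in \Sigma$ for the initial configuration $X$. Then the strategy $\sigma[+\alpha]$ is $(T, >0)$-winning in $X\alpha$. Thus, $X\alpha \in [T]_\Box^{>0}$.

• $X \in \mathscr{B}$ and $\alpha \in \mathscr{B}^*\mathscr{A}\Gamma^*$. Then there exists a $(T_\varepsilon, >0)$-winning strategy $\sigma_1 \in \Sigma$ in $X$. By induction hypothesis, there is a $(T, >0)$-winning strategy $\sigma_2 \in \Sigma$ in $\alpha$. We construct a strategy $\sigma'$ which behaves like $\sigma_1[+\alpha]$ until $\alpha$ is reached, and from that point on it behaves like $\sigma_2$. Obviously, $\sigma'$ is $(T, >0)$-winning, which means that $X\alpha \in [T]_\Box^{>0}$.

The proof of $[T_\varepsilon]_\Box^{>0} = \mathscr{B}^*\mathscr{A}\Gamma^* \cup \mathscr{B}^*$ is similar. □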

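Our next proposition says how to compute the sets $\mathscr{A}$ and $\mathscr{B}$.

Proposition 5.2. The pair $(\mathscr{A}, \mathscr{B})$ is the least fixed-point of the function $F : (2^\Gamma \times 2^\Gamma) \to (2^\Gamma \times 2^\Gamma)$ defined as follows: $F(A, B) = (\hat A, \hat B)$, where

\[
\begin{aligned}
\hat A = {} & \Gamma_T \cup A \cup \{X \in \Gamma_\Box \cup \Gamma_\bigcirc \mid \text{there is } X \hookrightarrow \beta \text{ such that } \beta \in B^*A\Gamma^*\} \\
& \cup \{X \in \Gamma_\Diamond \mid \text{for all } X \hookrightarrow \beta \text{ we have that } \beta \in B^*A\Gamma^*\} \\
\hat B = {} & \Gamma_T \cup B \cup \{X \in \Gamma_\Box \cup \Gamma_\bigcirc \mid \text{there is } X \hookrightarrow \beta \text{ such that } \beta \in B^*A\Gamma^* \cup B^*\} \\
& \cup \{X \in \Gamma_\Diamond \mid \text{for all } X \hookrightarrow \beta \text{ we have that } \beta \in B^*A\Gamma^* \cup B^*\}
\end{aligned}
\]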

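Proof. For every $i \in \mathbb{N}_0$, let $(A_i, B_i) = F^i(\emptyset, \emptyset)$. The set $2^\Gamma \times 2^\Gamma$ with the component-wise inclusion forms a finite lattice. The longest chain in this lattice has length $2|\Gamma| + 1$. Since $F$ is clearly monotone, by the Knaster-Tarski theorem $(\mathscr{A}_F, \mathscr{B}_F) = (\bigcup_{i=1}^{2|\Gamma|} A_i, \bigcup_{i=1}^{2|\Gamma|} B_i)$ is the least fixed-point of $F$. We show that $(\mathscr{A}_F, \mathscr{B}_F) = (\mathscr{A}, \mathscr{B})$.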

We start with the "$\subseteq$" direction. We use the following notation:

• for every $X \in \mathscr{A}_F$, let $I_A(X)$ be the least $i \in \mathbb{N}$ such that $X \in A_i$;

• for every $X \in \mathscr{B}_F$, let $I_B(X)$ be the least $i \in \mathbb{N}$ such that $X \in B_i$;

• for every $\alpha Y \in \mathscr{B}_F^*\mathscr{A}_F$, let $I(\alpha Y) = \max(\{I_A(Y)\} \cup \{I_B(Z) \mid Z \text{ appears in } \alpha\})$;

• for every $\beta \in \Gamma^*$, let $price(\beta) = \min\{I(\gamma) \mid \gamma \text{ is a prefix of } \beta,\ \gamma \in \mathscr{B}_F^*\mathscr{A}_F\}$, where $\min(\emptyset) = \infty$.

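First observe that $\Gamma_T$ is a subset of both $\mathscr{A}$ and $\mathscr{B}$. For every $X \in (\mathscr{A}_F \cap \Gamma_\Box) \setminus \Gamma_T$, we fix some $X \hookrightarrow \alpha$ (the "$A$-rule") such that $price(\alpha) < I_A(X)$. It follows directly from the definition of $F$ that there must be such a rule. Similarly, for every $X \in (\mathscr{B}_F \cap \Gamma_\Box) \setminus \Gamma_T$, we fix some $X \hookrightarrow \alpha$ (the "$B$-rule") such that either $price(\alpha) < I_B(X)$, or $\alpha \in \mathscr{B}_F^*$ and $I_B(Y) < I_B(X)$ for every $Y$ which appears in $\alpha$.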

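Now consider an MD strategy $\sigma \in \Sigma$ which for a given $X\alpha \in \mathscr{B}_F^*\mathscr{A}_F\Gamma^* \cap \Gamma_\Box\Gamma^*$ selects

• an arbitrary outgoing rule if $X \in \Gamma_T$;

• the $A$-rule of $X$ if $X \in \mathscr{A}_F$ and $I_A(X) = price(X\alpha)$;

• the $B$-rule of $X$ otherwise.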

We claim that $\sigma$ is $(T, >0)$-winning in every configuration of $\mathscr{B}_F^*\mathscr{A}_F\Gamma^*$. In particular, this means that $\mathscr{A}_F \subseteq \mathscr{A}$. To see this, realize that for every $\pi \in \Pi$, the play $G_\Delta(\sigma, \pi)$ contains a path along which every transition either decreases the price, or maintains the price but either decreases the length or replaces the first symbol with a sequence of symbols whose $I_B$-value is strictly smaller. Hence, this path must inevitably visit $T$ after performing a finite number of transitions.

Similar arguments show that $\sigma$ is $(T_\varepsilon, >0)$-winning in every configuration of $\mathscr{B}_F^*\mathscr{A}_F\Gamma^* \cup \mathscr{B}_F^*$. In particular, this means that $\mathscr{B}_F \subseteq \mathscr{B}$.

Now we prove the "$\supseteq$" direction, i.e., $\mathscr{A}_F \supseteq \mathscr{A}$ and $\mathscr{B}_F \supseteq \mathscr{B}$. Let us define the $\mathscr{A}$-norm of a given $X \in \Gamma$, $N_A(X)$, to be the least $n$ such that for some $\sigma \in \Sigma$ and for all $\pi \in \Pi$ there is a path in $G_\Delta(\sigma, \pi)$ of length at most $n$ from $X$ to $T$. Similarly, define the $\mathscr{B}$-norm of a given $X \in \Gamma$, $N_B(X)$, to be the least $n$ such that for some $\sigma \in \Sigma$ and for all $\pi \in \Pi$ there is a path in $G_\Delta(\sigma, \pi)$ of length at most $n$ from $X$ to $T_\varepsilon$ (if there are no such paths, then we put $N_A(X) = \infty$ and $N_B(X) = \infty$, respectively).

It follows from König's lemma and the fact that the game is finitely branching that $N_A(X)$ is finite for every $X \in \mathscr{A}$ and $N_B(X)$ is finite for every $X \in \mathscr{B}$. Also note that for all $X \in \Gamma$ we have that $N_A(X) \geq N_B(X)$.

We show, by induction on $n$, that every $X \in \mathscr{A}$ s.t. $N_A(X) = n$ belongs to $A_n$, and that every $X \in \mathscr{B}$ s.t. $N_B(X) = n$ belongs to $B_n$. The base case is easy since $N_A(X) = 1$ iff $N_B(X) = 1$ iff $X \in \Gamma_T$, and $(A_1, B_1) = (\Gamma_T, \Gamma_T)$. The inductive step follows:

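• $X \in \mathscr{A}$. If $X \in \Gamma_\Box \cup \Gamma_\bigcirc$ (or $X \in \Gamma_\Diamond$), then some (or every) rule of the form $X \hookrightarrow \beta Y \gamma$ satisfies $\beta \in \mathscr{B}^*$, $Y \in \mathscr{A}$, $N_A(Y) < n$, and $N_B(Z) < n$ for all $Z$ which appear in $\beta$. By induction hypothesis, $\beta \in B_{n-1}^*$ and $Y \in A_{n-1}$. Hence, $X \in A_n$.

• $X \in \mathscr{B}$. If $X \in \Gamma_\Box \cup \Gamma_\bigcirc$ (or $X \in \Gamma_\Diamond$), then some (or every) rule of the form $X \hookrightarrow \beta$ satisfies one of the following conditions:

  - $\beta = \beta' Y \gamma$ where $\beta' \in \mathscr{B}^*$, $Y \in \mathscr{A}$, $N_A(Y) < n$, and $N_B(Z) < n$ for all $Z$ which appear in $\beta'$. By induction hypothesis, $\beta' \in B_{n-1}^*$ and $Y \in A_{n-1}$. Hence, $X \in A_n \subseteq B_n$.

  - $\beta \in \mathscr{B}^*$ where $N_B(Z) < n$ for all $Z$ which appear in $\beta$. By induction hypothesis, $\beta \in B_{n-1}^*$ and hence $X \in B_n$.

□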
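
The least fixed-point of $F$ can be obtained by plain iteration from $(\emptyset, \emptyset)$. The following is a minimal sketch of that iteration, not the authors' implementation. The dictionary encoding, the owner labels `"box"`, `"diamond"`, `"random"`, and the example game (the rules of Example 6.5 extended by a hypothetical popping symbol `P`) are assumptions made here for illustration.

```python
def in_BstarA_gamma(beta, A, B):
    """True iff beta lies in B* A Gamma*: a symbol of A preceded only by symbols of B."""
    for sym in beta:
        if sym in A:
            return True
        if sym not in B:
            return False
    return False


def in_Bstar(beta, B):
    """True iff every symbol of beta lies in B (the empty word qualifies)."""
    return all(sym in B for sym in beta)


def least_fixed_point(rules, owner, targets):
    """Iterate F from (empty, empty) until nothing changes; returns the pair (A, B)."""
    A, B = set(), set()
    while True:
        new_a, new_b = set(targets) | A, set(targets) | B
        for X, rhss in rules.items():
            a_hits = [in_BstarA_gamma(b, A, B) for b in rhss]
            b_hits = [in_BstarA_gamma(b, A, B) or in_Bstar(b, B) for b in rhss]
            # Player diamond needs every rule to qualify; box and random need one.
            pick = all if owner[X] == "diamond" else any
            if pick(a_hits):
                new_a.add(X)
            if pick(b_hits):
                new_b.add(X)
        if (new_a, new_b) == (A, B):
            return A, B
        A, B = new_a, new_b


# Hypothetical example: the rules of Example 6.5 plus a popping symbol P.
RULES = {
    "X": [("X",), ("Y",), ("Z",)],
    "Y": [("Y",)],
    "Z": [("Y",), ("R",)],
    "R": [("R",)],
    "P": [()],
}
OWNER = {"X": "box", "Y": "random", "Z": "random", "R": "random", "P": "box"}
A, B = least_fixed_point(RULES, OWNER, {"R"})
```

With the two sets in hand, membership of a configuration in $\mathscr{B}^*\mathscr{A}\Gamma^*$ is a single left-to-right scan, which is what `in_BstarA_gamma` does.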

Since the least fixed-point of the function $F$ defined in Proposition 5.2 is computable in polynomial time, the finite-state automata recognizing the sets $[T]_\Box^{>0}$ and $[T]_\Diamond^{=0}$ are computable in polynomial time. Thus, we obtain the following theorem:

Theorem 5.3. The membership to $[T]_\Box^{>0}$ and $[T]_\Diamond^{=0}$ is decidable in polynomial time. Both sets are effectively regular, and the associated finite-state automata are constructible in polynomial time. Further, there is a regular strategy $\sigma \in \Sigma$ and an SMD strategy $\pi \in \Pi$, both constructible in polynomial time, such that $\sigma$ is $(T, >0)$-winning in every configuration of $[T]_\Box^{>0}$ and $\pi$ is $(T, =0)$-winning in every configuration of $[T]_\Diamond^{=0}$.

Proof. Due to Proposition 5.2, it only remains to show that $\sigma$ is regular, $\pi$ is SMD, and both $\sigma$ and $\pi$ are effectively constructible in polynomial time. Observe that the MD strategy $\sigma$ defined in the proof of Proposition 5.2 is $(T, >0)$-winning for player □. Moreover, $\sigma$ is regular, because the price of a given configuration can be determined by an effectively constructible finite-state automaton which reads configurations from right to left. Since the price of a given configuration is bounded by $2|\Gamma|$, the automaton needs only $O(|\Gamma|)$ control states and can be easily computed in polynomial time.

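An SMD $(T, =0)$-winning strategy $\pi$ for player ◇ is easy to construct. Consider a strategy $\pi$ such that for every $X\alpha \in \Gamma_\Diamond\Gamma^*$ we have that

• if $X \in (\mathscr{B} \setminus \mathscr{A})$, then $\pi(X\alpha)$ selects an edge $X\alpha \mapsto \beta\alpha$ where $X \hookrightarrow \beta$ and $\beta \in (\mathscr{B} \setminus \mathscr{A})^* \cup (\mathscr{B} \setminus \mathscr{A})^*(\Gamma \setminus \mathscr{B})\Gamma^*$;

• if $X \in (\Gamma \setminus \mathscr{B})$, then $\pi(X\alpha)$ selects an edge $X\alpha \mapsto \beta\alpha$ where $X \hookrightarrow \beta$ and $\beta \in (\mathscr{B} \setminus \mathscr{A})^*(\Gamma \setminus \mathscr{B})\Gamma^*$;

• otherwise, $\pi$ is defined arbitrarily.

It is easy to check that $\pi$ is $(T, =0)$-winning in every configuration of $[T]_\Diamond^{=0} = (\mathscr{B} \setminus \mathscr{A})^* \cup (\mathscr{B} \setminus \mathscr{A})^*(\Gamma \setminus \mathscr{B})\Gamma^*$.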

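Remark 5.4. Note that Theorem 5.3 holds also for the winning regions $[T_\varepsilon]_\Box^{>0}$ and $[T_\varepsilon]_\Diamond^{=0}$. The argument is particularly simple in the case of $[T_\varepsilon]_\Diamond^{=0}$, where we only need to modify the strategy $\pi$ constructed in the proof of Theorem 5.3 so that if $X \in (\mathscr{B} \setminus \mathscr{A})$, then $\pi(X\alpha)$ selects an edge $X\alpha \mapsto \beta\alpha$ where $X \hookrightarrow \beta$ and $\beta \in (\mathscr{B} \setminus \mathscr{A})^*(\Gamma \setminus \mathscr{B})\Gamma^*$.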

6. Computing the Regions $[T]_\Box^{=1}$ and $[T]_\Diamond^{<1}$

The results presented in this section constitute the very core of this paper. The problems are more complicated than in the case of $[T]_\Box^{>0}$ and $[T]_\Diamond^{=0}$, and several deep observations are needed to tackle them. As in Section 5, we fix a stochastic BPA game $\Delta = (\Gamma, \hookrightarrow, (\Gamma_\Box, \Gamma_\Diamond, \Gamma_\bigcirc), Prob)$ and a simple set $T$ of target configurations such that, for every $R \in \Gamma_T$, the only rule where $R$ appears on the left-hand side is $R \hookrightarrow R$.

The regularity of the sets $[T]_\Box^{=1}$ and $[T]_\Diamond^{<1}$ is revealed in the next proposition.

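Proposition 6.1. Let $\mathscr{A} = [T_\varepsilon]_\Diamond^{<1} \cap \Gamma$, $\mathscr{B} = [T_\varepsilon]_\Box^{=1} \cap \Gamma$, $\mathscr{C} = [T]_\Diamond^{<1} \cap \Gamma$, and $\mathscr{D} = [T]_\Box^{=1} \cap \Gamma$. Then $[T]_\Box^{=1} = \mathscr{B}^*\mathscr{D}\Gamma^*$ and $[T]_\Diamond^{<1} = \mathscr{C}^*\mathscr{A}\Gamma^* \cup \mathscr{C}^*$.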

Proof. We prove just the equality $[T]_\Box^{=1} = \mathscr{B}^*\mathscr{D}\Gamma^*$ (a proof of the other equality is similar). By induction on the length of $\alpha \in \Gamma^*$, we show that $\alpha \in [T]_\Box^{=1}$ iff $\alpha \in \mathscr{B}^*\mathscr{D}\Gamma^*$, using the notation $\sigma[-\alpha]$ and $\sigma[+\alpha]$ that was introduced in the proof of Proposition 5.1. For $\alpha = \varepsilon$, both sides of the equivalence are false. Now assume that the equivalence holds for all configurations of length $k$, and consider an arbitrary $X\alpha \in \Gamma^+$ where $|\alpha| = k$. If $X\alpha \in [T]_\Box^{=1}$, we distinguish two possibilities:

• There is a strategy $\sigma \in \Sigma$ such that for all $\pi \in \Pi$, the probability of reaching $T$ from $X\alpha$ without prior reaching $\alpha$ is 1 in $G_\Delta(\sigma, \pi)$. Then $\sigma[-\alpha]$ is $(T, =1)$-winning in $X$, which means that $X \in [T]_\Box^{=1}$, i.e., $X \in \mathscr{D}$.

• There is a strategy $\sigma \in \Sigma$ such that for all $\pi \in \Pi$, the probability of reaching $T$ from $X\alpha$ in the play $G_\Delta(\sigma, \pi)$ is 1, but for some $\pi \in \Pi$, the configuration $\alpha$ is reached with a positive probability before reaching $T$. In this case, consider again the strategy $\sigma[-\alpha]$, which is $(T_\varepsilon, =1)$-winning in $X$ and hence $X \in \mathscr{B}$. Moreover, observe that the strategy $\sigma$ is $(T, =1)$-winning in $\alpha$. Hence, $\alpha \in [T]_\Box^{=1}$ and by applying induction hypothesis we obtain $\alpha \in \mathscr{B}^*\mathscr{D}\Gamma^*$.

For the opposite direction, we assume $X\alpha \in \mathscr{B}^*\mathscr{D}\Gamma^*$, and distinguish the following possibilities:

• $X \in \mathscr{D}$ and there is a $(T, =1)$-winning strategy $\sigma \in \Sigma$ in $X$. Then $\sigma[+\alpha]$ is $(T, =1)$-winning in $X\alpha$. Thus, $X\alpha \in [T]_\Box^{=1}$.

• $X \in \mathscr{B}$ and $\alpha \in \mathscr{B}^*\mathscr{D}\Gamma^*$. Then there is a $(T_\varepsilon, =1)$-winning strategy $\sigma_1 \in \Sigma$ in $X$. By applying induction hypothesis, there is a $(T, =1)$-winning strategy $\sigma_2 \in \Sigma$ in $\alpha$. Now we can set up a $(T, =1)$-winning strategy in $X\alpha$, which behaves like $\sigma_1[+\alpha]$ until $\alpha$ is reached, and from that point on it behaves like $\sigma_2$. Hence, $X\alpha \in [T]_\Box^{=1}$. □

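By Theorem 3.3, $\mathscr{B} = \Gamma \setminus \mathscr{A}$ and $\mathscr{D} = \Gamma \setminus \mathscr{C}$. Hence, it suffices to compute the sets $\mathscr{A}$ and $\mathscr{C}$. Further, observe that if the set $\mathscr{A}$ is computable for an arbitrary stochastic BPA game, then the set $\mathscr{C}$ is also computable with the same complexity. This is because $X \in [T]_\Diamond^{<1}$ iff $\hat X \in [\hat T_\varepsilon]_\Diamond^{<1}$, where $[\hat T_\varepsilon]_\Diamond^{<1}$ is considered in a stochastic BPA game $\hat\Delta$ obtained from $\Delta$ by adding two fresh symbols $\hat X$ and $Z$ to $\Gamma_\bigcirc$ together with the rules $\hat X \hookrightarrow XZ$, $Z \hookrightarrow Z$, and setting $\Gamma_{\hat T} = \Gamma_T$. Hence, the core of the whole problem is to design an algorithm which computes the set $\mathscr{A}$.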

In the next definition we introduce the crucial notion of a terminal set of stack symbols, which plays a key role in our considerations.

Definition 6.2. A set $M \subseteq \Gamma$ is terminal if the following conditions are satisfied:

• $\Gamma_T \cap M = \emptyset$;

• for every $Z \in M \cap (\Gamma_\Box \cup \Gamma_\bigcirc)$ and every rule of the form $Z \hookrightarrow \alpha$ we have that $\alpha \in M^*$;

• for every $Z \in M \cap \Gamma_\Diamond$ there is a rule $Z \hookrightarrow \alpha$ such that $\alpha \in M^*$.

Since the empty set is terminal and the union of two terminal sets is terminal, there is the greatest terminal set that will be denoted by $C$ in the rest of this section. Also note that $C$ determines a stochastic BPA game $\Delta_C$ obtained from $\Delta$ by restricting the set of stack symbols to $C$ and including all rules $X \hookrightarrow \alpha$ where $X, \alpha \in C^*$. The set of rules of $\Delta_C$ is denoted by $\hookrightarrow_C$. The probability of stochastic rules in $\Delta_C$ is the same as in $\Delta$.
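
The greatest terminal set can be found by starting from all non-target symbols and repeatedly discarding symbols that violate a condition of Definition 6.2. The sketch below assumes the reading of that definition in which symbols of players □ and ○ must keep every rule inside the set while symbols of player ◇ need one such rule. The dictionary encoding and the owner labels are illustrative choices, and the example game is the one of Example 6.5.

```python
def greatest_terminal_set(rules, owner, targets):
    """Shrink the set of non-target symbols until it satisfies Definition 6.2."""
    M = set(rules) - set(targets)
    changed = True
    while changed:
        changed = False
        for Z in list(M):
            # For each rule of Z, does its right-hand side stay inside M?
            stays = [all(s in M for s in alpha) for alpha in rules[Z]]
            ok = any(stays) if owner[Z] == "diamond" else all(stays)
            if not ok:
                M.discard(Z)
                changed = True
    return M


# The game of Example 6.5 (probabilities are irrelevant for this computation).
RULES = {
    "X": [("X",), ("Y",), ("Z",)],
    "Y": [("Y",)],
    "Z": [("Y",), ("R",)],
    "R": [("R",)],
}
OWNER = {"X": "box", "Y": "random", "Z": "random", "R": "random"}
C = greatest_terminal_set(RULES, OWNER, {"R"})
```

Here `Z` is discarded because one of its rules leads to the target symbol `R`, and `X` is discarded afterwards because player □ may move to `Z`.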

Definition 6.3. A stack symbol $Y \in \Gamma$ is a witness if one of the following conditions is satisfied:

(1) $Y \in [T_\varepsilon]_\Diamond^{=0}$;

(2) $Y \in C$ and $Y \in [\varepsilon]_\Diamond^{<1}$, where the set $[\varepsilon]_\Diamond^{<1}$ is computed in $\Delta_C$.

The set of all witnesses is denoted by $W$.

In the next lemma we show that every witness belongs to the set $\mathscr{A}$.

Lemma 6.4. The problem whether $Y \in W$ for a given $Y \in \Gamma$ is in NP ∩ co-NP. Further, there is an SMD strategy $\pi \in \Pi$ constructible by a deterministic polynomial-time algorithm with NP ∩ co-NP oracle such that for all $Y \in W$ and $\sigma \in \Sigma$ we have that $\mathcal{P}_Y^{\sigma,\pi}(Reach(T_\varepsilon)) < 1$.

Proof. Let $W_2$ be the set of all type (2) witnesses of $\Delta$, and let $W_1$ be the set of all type (1) witnesses that are not type (2) witnesses (see Definition 6.3).

Let us first consider the BPA game $\Delta_C$ (note that $\Delta_C$ is constructible in polynomial time). By the results of [13], there are SMD strategies $\sigma'$ and $\pi'$ in $G_{\Delta_C}$ such that $\sigma'$ is $(\varepsilon, =1)$-winning in every configuration of $[\varepsilon]_\Box^{=1}$ and $\pi'$ is $(\varepsilon, <1)$-winning in every configuration of $[\varepsilon]_\Diamond^{<1}$ (here the sets $[\varepsilon]_\Box^{=1}$ and $[\varepsilon]_\Diamond^{<1}$ are considered in $\Delta_C$). In [13], it is also shown that the problem whether a given SMD strategy is $(\varepsilon, =1)$-winning (or $(\varepsilon, <1)$-winning) in every configuration of $[\varepsilon]_\Box^{=1}$ (or $[\varepsilon]_\Diamond^{<1}$) is decidable in polynomial time. Hence, the problem whether a given $Y \in \Gamma$ belongs to $W_2$ is in NP ∩ co-NP, and the strategy $\pi'$ is constructible by an algorithm which successively fixes one of the available rules for every $Y \in \Gamma_\Diamond \cap C$ so that the set $[\varepsilon]_\Diamond^{<1}$ remains unchanged when all of the other rules with $Y$ on the left-hand side are removed from $\Delta_C$. Obviously, this algorithm needs only $O(|\Delta_C|)$ time to fix such a rule for every $Y \in \Gamma_\Diamond \cap C$ (i.e., to construct the strategy $\pi'$) if it is equipped with an NP ∩ co-NP oracle which can be used to verify that the currently considered rule is a correct one.

The strategy $\pi'$ can also be applied in the game $G_\Delta$ (for every $Z \in \Gamma_\Diamond \setminus C$ we just define $\pi'(Z)$ arbitrarily). Since $\Gamma_T \cap C = \emptyset$, for all $Y \in W_2$ and $\sigma \in \Sigma$ we have that $\mathcal{P}_Y^{\sigma,\pi'}(Reach(T_\varepsilon)) < 1$.

The remaining witnesses of $W_1$ can be discovered in polynomial time, and there is an SMD strategy $\pi'' \in \Pi$ constructible in polynomial time such that for all $Y \in W_1$ and $\sigma \in \Sigma$ we have that $\mathcal{P}_Y^{\sigma,\pi''}(Reach(T_\varepsilon)) = 0$ or $\mathcal{P}_Y^{\sigma,\pi''}(Reach(W_2\Gamma^*)) > 0$. This follows directly from Theorem 5.3 and Remark 5.4.

The strategy $\pi$ is constructed simply by "combining" the strategies $\pi'$ and $\pi''$. That is, $\pi$ behaves like $\pi'$ (or $\pi''$) in all configurations $Y\alpha$ where $Y \in W_2$ (or $Y \in W_1$). □

Due to Lemma 6.4, we have that $W \subseteq \mathscr{A}$. One may be tempted to think that the set $\mathscr{A}$ is just the attractor of $W$, denoted $Att(W)$, which consists of all stack symbols from which player ◇ can enforce visiting a witness with a positive probability. However, this (natural) hypothesis is false, as demonstrated by the following example:

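Example 6.5. Consider a stochastic BPA game $\Delta = (\{X, Y, Z, R\}, \hookrightarrow, (\{X\}, \emptyset, \{Y, Z, R\}), Prob)$, where $X \hookrightarrow X$, $X \hookrightarrow Y$, $X \hookrightarrow Z$, $Y \hookrightarrow Y$, $Z \hookrightarrow Y$, $Z \hookrightarrow R$, $R \hookrightarrow R$, and the set $\Gamma_T$ contains just $R$. The game is initiated in $X$, and the relevant part of $G_\Delta$ (reachable from $X$) is shown in the following figure: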



Observe that $\mathscr{A} = \{X, Y, Z\}$, $C = W = \{Y\}$, but $Att(\{Y\}) = \{Z, Y\}$.

The problem is that, in general, player □ cannot be "forced" to enter $Att(W)$ (in Example 6.5, player □ can always select the rule $X \hookrightarrow X$ and thus avoid entering $Att(\{Y\})$). Nevertheless, observe that player □ has essentially only two options: she either enters a symbol of $Att(W)$, or avoids visiting the symbols of $Att(W)$ completely. The second possibility is analyzed by "cutting off" the set $Att(W)$ from the considered BPA game, and recomputing the set of all witnesses together with its attractor in the resulting BPA game which is smaller than the original one. In Example 6.5, we "cut off" the attractor $Att(\{Y\})$ and thus obtain a smaller BPA game with just one symbol $X$ and the rule $X \hookrightarrow X$. Since $X$ is a witness in this game, it can be safely added to the set $\mathscr{A}$. In general, the algorithm for computing the set $\mathscr{A}$ proceeds by putting $\mathscr{A} := \emptyset$ and then repeatedly computing the set $Att(W)$, setting $\mathscr{A} := \mathscr{A} \cup Att(W)$, and "cutting off" the set $Att(W)$ from the game. This goes on until the set $Att(W)$ becomes empty.
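
The attractor of Example 6.5 can be reproduced with a small fixed-point loop. This is a sketch under a simplifying assumption that is not part of the general algorithm: every right-hand side consists of exactly one symbol, so the height of the stack never changes. The encoding and the owner labels are illustrative.

```python
def attractor(rules, owner, witnesses):
    """Symbols from which player diamond can enforce a positive-probability visit to a witness."""
    S = set(witnesses)
    changed = True
    while changed:
        changed = False
        for A, rhss in rules.items():
            if A in S:
                continue
            # Assumption: each right-hand side is a single symbol.
            hits = [alpha[0] in S for alpha in rhss]
            # Player box enters S only if she cannot avoid it; the other
            # two kinds of symbols need just one rule leading into S.
            enters = all(hits) if owner[A] == "box" else any(hits)
            if enters:
                S.add(A)
                changed = True
    return S


# The game of Example 6.5 with the witness set W = {Y}.
RULES = {
    "X": [("X",), ("Y",), ("Z",)],
    "Y": [("Y",)],
    "Z": [("Y",), ("R",)],
    "R": [("R",)],
}
OWNER = {"X": "box", "Y": "random", "Z": "random", "R": "random"}
ATT = attractor(RULES, OWNER, {"Y"})
```

The symbol `X` stays outside the attractor because player □ can keep choosing the rule that loops on `X`, which is exactly the obstacle discussed above.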

We start by demonstrating that if $\mathscr{A} \neq \emptyset$, then there is at least one witness. This is an important (and highly non-trivial) result, whose proof is postponed to Section 7.1.

Proposition 6.6. If $\mathscr{A} \neq \emptyset$, then $W \neq \emptyset$.

In other words, the non-emptiness of $\mathscr{A}$ is always certified by at least one witness, and hence each stochastic BPA game with a non-empty $\mathscr{A}$ can be made smaller by "cutting off" $Att(W)$. The procedure which "cuts off" the symbols $Att(W)$ is not completely trivial. A naive idea of removing the symbols of $Att(W)$ together with the rules where they appear (this was used for the stochastic BPA game of Example 6.5) does not always work. This is illustrated in the following example:

Example 6.7. Consider a stochastic BPA game $\Delta = (\{X, Y, Z, R\}, \hookrightarrow, (\{X\}, \emptyset, \{Y, Z, R\}), Prob)$, where

$X \hookrightarrow X$, $X \hookrightarrow Y$, $X \hookrightarrow ZY$, $Y \hookrightarrow Y$, $Z \hookrightarrow X$, $Z \hookrightarrow R$, $R \hookrightarrow R$

and $\Gamma_T = \{R\}$. The game is initiated in $X$ (see Fig. 2). We have that $\mathscr{A} = \{Y\}$ (observe that $X, Z, R \in [T_\varepsilon]_\Box^{=1}$, because the strategy $\sigma$ of player □ which always selects the rule $X \hookrightarrow ZY$ is $(T, =1)$-winning). Further, we have that $C = W = Att(W) = \{Y\}$. If we remove $Y$ together with all rules where $Y$ appears, we obtain the game $\Delta' = (\{X, Z, R\}, \hookrightarrow, (\{X\}, \emptyset, \{Z, R\}), Prob)$, where $X \hookrightarrow X$, $Z \hookrightarrow X$, $Z \hookrightarrow R$, $R \hookrightarrow R$. In the game $\Delta'$, $X$ becomes a witness and hence the algorithm would incorrectly put $X$ into $\mathscr{A}$.

Figure 2: The game of Example 6.7.



Hence, the "cutting" procedure must be designed more carefully. Intuitively, we do not remove rules of the form $X \hookrightarrow ZY$, where $Y \in Att(W)$, but change them into $X \hookrightarrow \tilde Z$, where the plays initiated in $\tilde Z$ "behave" like the ones initiated in $Z$ with the exception that $\varepsilon$ cannot be reached whatever the players do.

Now we show how to compute the set $\mathscr{A}$, formalizing the intuition given above. To simplify the proofs of our claims, we adopt some additional (safe) assumptions about the considered BPA game $\Delta$.

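Definition 6.8. We say that $\Delta$ is in special normal form (SNF) if all of the following conditions are satisfied:

• For every $R \in \Gamma_T$ we have that $R \in \Gamma_\bigcirc$ and $R \hookrightarrow R$.

• For every rule $X \hookrightarrow \alpha$ where $X \in \Gamma_\Diamond \cup \Gamma_\bigcirc$ we have that $\alpha \in \Gamma$.

• The set $\Gamma_\Box$ can be partitioned into three disjoint subsets $\Gamma[1]$, $\Gamma[2]$, and $\Gamma[3]$ so that

  - if $X \in \Gamma[1]$ and $X \hookrightarrow \alpha$, then $\alpha \in \Gamma$;

  - if $X \in \Gamma[2]$, then $X \hookrightarrow \varepsilon$ and there is no other rule of the form $X \hookrightarrow \alpha$;

  - if $X \in \Gamma[3]$, then $X \hookrightarrow YZ$ for some $Y, Z \in \Gamma$, and there is no other rule of the form $X \hookrightarrow \alpha$.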

Note that every BPA game can be efficiently transformed into an "equivalent" BPA game in SNF by introducing fresh stack symbols (which belong to player □) and adding the corresponding dummy rules. For example, if the original BPA game contains the rules $X \hookrightarrow \varepsilon$ and $X \hookrightarrow YZ$, then the newly constructed BPA game in SNF contains the rules $X \hookrightarrow E$, $X \hookrightarrow P$, $E \hookrightarrow \varepsilon$, $P \hookrightarrow YZ$, where $E, P$ are fresh stack symbols that belong to player □. Obviously, the set $\mathscr{A}$ of the original BPA game is the set $\mathscr{A}$ of the newly constructed BPA game restricted to the stack symbols of the original BPA game.

So, from now on we assume that the considered BPA game $\Delta$ is in SNF. In particular, note that only player □ can change the height of the stack; and if she can do it, then she cannot do anything else for the given stack symbol.

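Our algorithm for computing the set $\mathscr{A}$ consists of two parts, the procedure Init and the procedure Main. The procedure Init transforms the BPA game $\Delta$ into another BPA game $\bar\Delta$, which is then used as an input for the procedure Main which computes the set $\mathscr{A}$ of $\bar\Delta$.

For every $X \in \Gamma$, let $\tilde X$ be a fresh "twin" of $X$, and let $\tilde\Gamma = \{\tilde X \mid X \in \Gamma\}$. Similarly, for every $\odot \in \{\Box, \Diamond, \bigcirc\}$ we put $\tilde\Gamma_\odot = \{\tilde X \mid X \in \Gamma_\odot\}$. The procedure Init inputs the BPA game $\Delta$ and outputs another BPA game $\bar\Delta = (\bar\Gamma, \hookrightarrow, (\bar\Gamma_\Box, \bar\Gamma_\Diamond, \bar\Gamma_\bigcirc), Prob)$ where $\bar\Gamma = \Gamma \cup \tilde\Gamma$, $\bar\Gamma_\odot = \Gamma_\odot \cup \tilde\Gamma_\odot$ for every $\odot \in \{\Box, \Diamond, \bigcirc\}$, and the rules are constructed as follows:

• if $X \hookrightarrow \varepsilon$ is a rule of $\Delta$, then $X \hookrightarrow \varepsilon$ and $\tilde X \hookrightarrow \tilde X$ are rules of $\bar\Delta$;

• if $X \hookrightarrow Y$ is a rule of $\Delta$, then $X \hookrightarrow Y$ and $\tilde X \hookrightarrow \tilde Y$ are rules of $\bar\Delta$;

• if $X \hookrightarrow YZ$ is a rule of $\Delta$, then $X \hookrightarrow YZ$ and $\tilde X \hookrightarrow Y\tilde Z$ are rules of $\bar\Delta$;

• $\bar\Delta$ has no other rules.

Further, if $X \overset{x}{\hookrightarrow} Y$ in $\Delta$, then $X \overset{x}{\hookrightarrow} Y$ and $\tilde X \overset{x}{\hookrightarrow} \tilde Y$ in $\bar\Delta$. We put $\bar\Gamma_T = \{R, \tilde R \mid R \in \Gamma_T\}$.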
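
The rule part of Init can be sketched as follows, assuming the three rule schemes are read as: a popping rule becomes a self-loop on the twin, a single-symbol rule is copied to twins, and a pushing rule marks only the bottom symbol as a twin. The suffix `~` used to name twins, the function name `init_twins`, and the small example game are all illustrative choices made here.

```python
def twin(symbol):
    """Name of the fresh twin of a stack symbol (hypothetical naming scheme)."""
    return symbol + "~"


def init_twins(rules):
    """Build the rules of the twin game from the rules of a game in SNF."""
    out = {}
    for X, rhss in rules.items():
        out[X] = [tuple(alpha) for alpha in rhss]   # original rules are kept
        twin_rules = []
        for alpha in rhss:
            if len(alpha) == 0:
                twin_rules.append((twin(X),))             # X -> eps becomes X~ -> X~
            elif len(alpha) == 1:
                twin_rules.append((twin(alpha[0]),))      # X -> Y becomes X~ -> Y~
            else:
                twin_rules.append((alpha[0], twin(alpha[1])))  # X -> YZ becomes X~ -> Y Z~
        out[twin(X)] = twin_rules
    return out


# Hypothetical SNF fragment: E pops, P pushes, Y and Z rewrite the top symbol.
RULES = {"E": [()], "P": [("Y", "Z")], "Y": [("Z",)], "Z": [("Z",)]}
TWINNED = init_twins(RULES)
```

In the resulting game a twin on the bottom of the stack can never be popped, which matches the intuition that a twin cannot be fully removed.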

Intuitively, the only difference between $X$ and $\tilde X$ is that $\tilde X$ can never be fully removed from the stack. Also observe that the newly added stack symbols of $\tilde\Gamma$ are unreachable from the original stack symbols of $\Gamma$. Hence, the set $\mathscr{A}$ of $\Delta$ is obtained simply by restricting the set $\mathscr{A}$ of $\bar\Delta$ to the symbols of $\Gamma$. In the rest of this section, we adopt the following convention: the elements of $\Gamma$ are denoted by $X, Y, Z, \ldots$, the corresponding elements of $\tilde\Gamma$ are denoted by $\tilde X, \tilde Y, \tilde Z, \ldots$, and for every $X \in \Gamma$, the symbol $\bar X$ denotes either $X$ or $\tilde X$.

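The set $\mathscr{A}$ of $\bar\Delta$ is computed by the procedure Main. At line 3, we assign to $M$ the least fixed-point of the function $Att_{\Theta,W} : 2^{\bar\Gamma} \to 2^{\bar\Gamma}$, where $\Theta$ is an auxiliary BPA game maintained by the procedure Main and $W$ is a subset of stack symbols of $\Theta$. The function $Att_{\Theta,W}$ is defined as follows (the set of rules of $\Theta$ is denoted by $\hookrightarrow_\Theta$):

\[
\begin{aligned}
Att_{\Theta,W}(S) = {} & W \\
& \cup \{A \in \bar\Gamma_\Diamond \cup \bar\Gamma_\bigcirc \mid \text{there is a rule } A \hookrightarrow_\Theta B \text{ where } B \in S\} \\
& \cup \{A \in \bar\Gamma[1] \mid B \in S \text{ for all } A \hookrightarrow_\Theta B\} \\
& \cup \{A \in \bar\Gamma[3] \mid A \hookrightarrow_\Theta YC \text{ where } Y \in S, \text{ or } \tilde Y, C \in S\}
\end{aligned}
\]

Note that the procedure Main actually computes the sets $\mathscr{A}$ and $\mathscr{C}$ of $\bar\Delta$ simultaneously, as stated in the following proposition. A proof is postponed to Section 7.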



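Procedure Main

Data: A BPA game $\bar\Delta = (\bar\Gamma, \hookrightarrow, (\bar\Gamma_\Box, \bar\Gamma_\Diamond, \bar\Gamma_\bigcirc), Prob)$.
Result: The sets $\mathcal{W}$ and $\mathcal{U}$.

 1  $\mathcal{W} := \emptyset$; $\mathcal{U} := \emptyset$; $\Theta := \bar\Delta$;
 2  while the greatest set $W$ of witnesses in $\Theta$ is not empty do
 3      $M$ := the least fixed-point of $Att_{\Theta,W}$;
 4      for every $A \in M$ do
 5          remove the symbol $A$ and all rules with $A$ on the left-hand side;
 6      for every rule $A \hookrightarrow B$ where $A \in \bar\Gamma_\Box \setminus M$ and $B \in M$ do
 7          remove the rule $A \hookrightarrow B$;
 8      for every rule $A \hookrightarrow YC$ where $A \in \bar\Gamma_\Box \setminus M$ and $C \in M$ do
 9          replace the rule $A \hookrightarrow YC$ with the rule $A \hookrightarrow \tilde Y$;
10      $\mathcal{W} := \mathcal{W} \cup M$;
11      $\mathcal{U} := \mathcal{U} \cup \{\bar Y \mid \tilde Y \in \mathcal{W}\}$;
12  return $\mathcal{W}, \mathcal{U}$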



Proposition 6.9. The sets W and lA computed by the procedure Main are 
exactly the sets and of the BPA game A, respectively. 

Now, let us analyze the complexity of the procedure Main. Obviously,
the main loop initiated at line 2 terminates after $O(|\Delta|)$ iterations.
In each iteration, we need to compute the greatest set of witnesses $W$ of
the current game, which is the only step that needs exponential time. Hence,
the running time of the procedure Main is exponential in the size of $\Delta$.
Nevertheless, the procedure Main can be easily modified into its
non-deterministic variant Main-NonDet where every computation terminates
after a polynomial number of steps, and all "successful" computations of
Main-NonDet output the same sets $\mathcal{W}, \mathcal{U}$ as the procedure
Main. This means that the membership problem as well as the non-membership
problem for the set $\mathcal{W}$ is in NP, which implies that both problems
are in fact in NP ∩ co-NP. The same applies to the set $\mathcal{U}$. The only
difference between the procedures Main and Main-NonDet is the way of computing
the greatest set of witnesses $W$. Due to Lemma 6.4, the problem whether
$Y \in W$ for a given $Y \in \Gamma$ is in NP ∩ co-NP. Hence, both membership
and non-membership in $W$ are certified by certificates of polynomial size
that are verifiable in polynomial time (in the proof of Lemma 6.4 we indicated
how to construct these certificates, but this is not important now). The
procedure Main-NonDet guesses the set $W$ together with a tuple of
certificates that are supposed to prove that the guess was fully correct
(i.e., the guessed set is exactly the set of all witnesses). Then, all of
these certificates are verified. If some of them turn out to be invalid, the
procedure Main-NonDet terminates immediately (this type of termination is
considered "unsuccessful"). Otherwise, the procedure Main-NonDet proceeds by
performing the same instructions as the procedure Main.
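
The guess-and-verify step of Main-NonDet can be illustrated by a small sketch. It is only an illustration under stated assumptions: the function name `check_guess`, the checker `verify`, and the encoding of certificates as opaque Python objects are hypothetical; the actual certificates are those indicated in the proof of Lemma 6.4.

```python
from typing import Callable, Dict, FrozenSet, Optional

Symbol = str
Certificate = object  # opaque placeholder; the real shape comes from Lemma 6.4


def check_guess(
    alphabet: FrozenSet[Symbol],
    guessed: FrozenSet[Symbol],
    certificates: Dict[Symbol, Certificate],
    verify: Callable[[Symbol, bool, Certificate], bool],
) -> Optional[FrozenSet[Symbol]]:
    """Return the guessed witness set if every certificate checks out, else None.

    `verify(y, claimed_member, cert)` is a hypothetical polynomial-time checker
    that accepts only if `cert` proves that y is (or is not) a witness.
    """
    for y in alphabet:
        cert = certificates.get(y)
        if cert is None or not verify(y, y in guessed, cert):
            return None  # "unsuccessful" computation: terminate immediately
    return guessed  # the guess is exactly the set of all witnesses
```

A successful call returns the guessed set, after which the computation would continue with the same instructions as Main; any failed check aborts the run.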

Since the membership problem for the sets is in NP fl co-NP, 

the membership problem for the sets J^, ^ is also in NP fl co-NP (see the 
discussion at page Hence, an immediate consequence of the previous 

observations and Proposition 16.11 is the following: 

Theorem 6.10. The membership to $[T]^{=1}$ and $[T]^{<1}$ is in NP ∩ co-NP.
Both sets are effectively regular, and the associated finite-state automata are
constructible by a deterministic polynomial-time algorithm with NP ∩ co-NP
oracle.

Since the arguments used in the proof of Proposition 6.9 are mostly
constructive, the winning strategies for both players are effectively regular.
This is stated in our final theorem (a proof can be found in Section 7).

Theorem 6.11. There are regular strategies $\sigma \in \Sigma$ and
$\pi \in \Pi$ such that $\sigma$ is $(T,{=}1)$-winning in every configuration
of $[T]^{=1}$ and $\pi$ is $(T,{<}1)$-winning in every configuration of
$[T]^{<1}$. Moreover, the strategies $\sigma$ and $\pi$ are constructible by a
deterministic polynomial-time algorithm with NP ∩ co-NP oracle.

7. Proofs of Section 6

In this section we present the proofs that were omitted in Section 6.

7.1. A Proof of Proposition 6.6

We start by formulating a simple corollary to Proposition 5.2 which turns
out to be useful at several places.

Proposition 7.1. Let $\sigma \in \Sigma$ be a strategy of player □ which
always returns a uniform probability distribution over the available outgoing
edges. Then for every $X \in [T]^{>0} \cap \Gamma$ (or
$X \in [T_\varepsilon]^{>0} \cap \Gamma$) and every $\pi \in \Pi$ there is a
path $w$ from $X$ to $T$ (to $T_\varepsilon$, resp.) in
$G_\Delta(\sigma, \pi)$ such that

1. the length of $w$ is at most $2^{2|\Gamma|}$;

2. the length of all configurations visited by $w$ is at most $2|\Gamma|$.

Proof. Let us consider the sets Ai and Bi from the proof of Proposition 15.21 
Recall that [T]>^ n F = \jf^^ A, and [Te]>^ n F = U^fJ B^. By induction on z, 
we prove that for every X & Ai (or X G Bi) and every tt G 11 there is a path 
w from X to T (or to T^, resp.) in G/^{a, tt) such that 

(1) the length of w is at most 2*; 

(2) the length of all configurations visited by w is at most i. 

The case i = 1 is trivial, as = SS\ = Tt- Now assume that i > 1. If 
X G n (Fq UFq), then by the definition of Ai, there is a transition X 7 
such that 7 G Ft U Ai_iF U Bi_iAi_i U Ai_i. By induction hypothesis, there 
is a path w' from 7 to T in (ja(o"; tt) of length at most 2* + 2* = 2*+^ such that 
the length of all configurations entered by w' is at most max{z+l, i} = i + 1. 
The rest follows from the fact that a always returns a uniform probability 
distribution, and if X G fl F^, then all outgoing transitions of X have 
the form X 7 where 7 G F^- U Aj_iF U Bi_iAi_i U Ai_i (we use induction 
hypothesis to obtain the desired result). The case when X E Bi follows 
similarly. □ 

Proposition 6.6 is obtained as a corollary to the following (stronger) claim
that will also be used later when synthesizing a regular $(T,{=}1)$-winning
strategy for player □.

Proposition 7.2. Let $W$ be the set of all witnesses (see Definition 6.3).
If $W = \emptyset$, then there is a regular strategy $\sigma$ of player □
computable in polynomial time, which is $(T_\varepsilon,{=}1)$-winning in every
configuration of $\Delta$.
In particular, if $W = \emptyset$ then = and thus we obtain Proposition 6.6.
Now we prove Proposition 7.2, relying on further technical observations that
are formulated and proved at appropriate places.

As $W = \emptyset$, the two conditions of Definition 6.3 are not satisfied by
any $Y \in \Gamma$. This means that for all $Y \in C$ we have that
$Y \in [\varepsilon]^{=1}$, where the set $[\varepsilon]^{=1}$ is computed in
$\Delta_C$ (we again use Theorem 3.3). Due to [12], there exists a SMD strategy
$\sigma_T$ for player □ in $G_{\Delta_C}$ such that for every $Y \in C$ and
every strategy $\pi$ of player O in $G_{\Delta_C}$ we have that
$\mathcal{P}_Y^{\sigma_T,\pi}(\mathit{Reach}(\varepsilon)) = 1$.

Let $\sigma_U$ be the SMD strategy of player □ which always returns the
uniform probability distribution over the available edges. In the proof we use
the following simple property of $\sigma_U$, which follows easily from
Proposition 7.1.

Lemma 7.3. There is $\zeta > 0$ such that for every $X \in \Gamma$ and every
$\pi \in \Pi$ there is a path $w$ from $X$ to a configuration of
$T_\varepsilon$ in $G_\Delta(\sigma_U, \pi)$ satisfying the following: The
length of all configurations visited by $w$ is bounded by $2|\Gamma|$, and the
probability of $\mathit{Run}(w)$ in $G_\Delta(\sigma_U, \pi)$ is at least
$\zeta$.

Proof. Since = 0, there are no type (1) witnesses (see Definition 16. 3p . i.e., 
r n [Te]^° = 0, which means that F C [Te]>° by Theorem [SJl Let vr G H 
be an arbitrary (possibly randomized) strategy. We define the associated
deterministic strategy $\hat{\pi}$, which for every finite sequence of
configurations $\alpha_1, \ldots, \alpha_n$ selects an edge
$\alpha_n \mapsto \beta$ such that $\alpha_n \mapsto \beta$ is assigned a
maximal probability in the distribution assigned to
$\alpha_1, \ldots, \alpha_n$ by the strategy $\pi$. In other words,
$\alpha_n \mapsto \beta$ is an edge selected with a maximal probability by
$\pi$. If there are several candidates for $\alpha_n \mapsto \beta$, any of
them can be chosen. Obviously, every path in $G_\Delta(\sigma_U, \hat{\pi})$
initiated in $X$ is also a path in $G_\Delta(\sigma_U, \pi)$ initiated in $X$.
Due to Proposition 7.1, there is a path $\hat{w}$ from $X$ to $T_\varepsilon$
in $G_\Delta(\sigma_U, \hat{\pi})$ such that the length of $\hat{w}$ is bounded
by $2^{2|\Gamma|}$ and the stack height of all configurations visited by
$\hat{w}$ is bounded by $2|\Gamma|$. Now consider the corresponding path $w$ in
$G_\Delta(\sigma_U, \pi)$. The only difference between $\hat{w}$ and $w$ is
that the probability of the transitions selected by player O is not necessarily
one in $w$. However, due to the definition of $\hat{\pi}$ we immediately obtain
that
the probability of each such transition is at least (this bound is not tight 
but sufficient for our purposes). Since $\sigma_U$ is uniform, the same bound
is valid also for the probability of transitions selected by player □. Let
$\mu$ be the least probability weight of a probabilistic rule assigned by
Prob. We put

Obviously, $\mathcal{P}(\mathit{Run}(w)) \ge \zeta$, and we are done.



□ 
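
The passage from a randomized strategy to its deterministic counterpart used in the proof above amounts to picking, in each history, an edge of maximal probability. A minimal sketch, assuming a distribution is encoded as a dictionary from edges to probabilities (the name `most_likely_edge` and this encoding are illustrative assumptions, not notation from the paper):

```python
from typing import Dict, Hashable, Tuple


def most_likely_edge(distribution: Dict[Hashable, float]) -> Tuple[Hashable, float]:
    """Pick an edge of maximal probability; ties go to the first such edge."""
    edge = max(distribution, key=distribution.get)
    return edge, distribution[edge]
```

Whatever the distribution is, the selected edge has probability at least 1/n when there are n outgoing edges, since the n probabilities sum to one. This is the kind of lower bound the proof relies on; the exact constant used there is not reproduced here.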



Now we are ready to define the regular strategy $\sigma \in \Sigma$ whose
existence was promised in Proposition 7.2. Recall that regular strategies are
memoryless, and hence they can be formally understood as functions which
assign to a given configuration $\beta$ a probability distribution on the
outgoing edges of $\beta$. For a given $X\alpha \in \Gamma_\Box \Gamma^*$, we
put $\sigma(X\alpha) = \sigma_T(X\alpha)$ if $X\alpha$ starts with some
$\beta \in C^*$ where $|\beta| > 2|\Gamma|$. Otherwise, we put
$\sigma(X\alpha) = \sigma_U(X\alpha)$.

Observe that the strategy $\sigma$ can easily be represented by a finite state
automaton with $O(|\Gamma|)$ states in the sense of Definition 4.3. Moreover,
such an automaton is easily constructible in polynomial time because the
set $C$ is computable in polynomial time. So, it remains to prove that
$\sigma$ is $(T_\varepsilon,{=}1)$-winning in every configuration of $\Delta$.
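
The way the regular strategy switches between its two component strategies can be sketched as follows. This is an illustrative sketch only: configurations are encoded as sequences of symbols with the top of the stack first, the two component strategies are passed in as the hypothetical parameters `sigma_t` and `sigma_u`, and distributions are plain dictionaries.

```python
from typing import Callable, Dict, Sequence, Set

Config = Sequence[str]       # stack content, top symbol first
Dist = Dict[str, float]      # assumed encoding of a distribution over edges
Strategy = Callable[[Config], Dist]


def c_prefix_length(config: Config, c_symbols: Set[str], cap: int) -> int:
    """Length of the longest prefix over c_symbols, counted only up to `cap`."""
    n = 0
    for symbol in config:
        if symbol not in c_symbols or n == cap:
            break
        n += 1
    return n


def combine(sigma_t: Strategy, sigma_u: Strategy,
            c_symbols: Set[str], gamma_size: int) -> Strategy:
    """Use sigma_t iff the configuration starts with more than 2*|Gamma| symbols of C."""
    threshold = 2 * gamma_size

    def sigma(config: Config) -> Dist:
        if c_prefix_length(config, c_symbols, threshold + 1) > threshold:
            return sigma_t(config)
        return sigma_u(config)

    return sigma
```

The prefix counter never needs to exceed `threshold + 1`, which mirrors why a finite-state automaton with a number of states linear in the size of the stack alphabet suffices to represent the combined strategy.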

Let us fix some strategy $\pi \in \Pi$. Our goal is to show that for every
$\alpha \in \Gamma^+$ we have that
$\mathcal{P}_\alpha^{\sigma,\pi}(\mathit{Reach}(T_\varepsilon)) = 1$. Assume
the converse, i.e., there is some $\alpha \in \Gamma^+$ such that
$\mathcal{P}_\alpha^{\sigma,\pi}(\mathit{Reach}(T_\varepsilon)) < 1$.

Proof outline: Let $w$ be a run of $G_\Delta(\sigma, \pi)$. We say that a
given rule of $\Delta$ is used infinitely often in $w$ if the rule was used to
derive infinitely many transitions of $w$. Further, we say that $w$ eventually
uses only a given subset $\hookrightarrow'$ of $\hookrightarrow$ if there is
some $i \in \mathbb{N}$ such that all transitions $w(j) \to w(j+1)$, where
$j > i$, were derived using a rule of $\hookrightarrow'$.

We show that the set of all runs initiated in $\alpha$ that do not visit
$T_\varepsilon$ contains a subset $V$ of positive probability such that all
runs of $V$ eventually use only the rules of $\Delta_C$. Then, we show that
player □, who plays according to the strategy $\sigma$, selects the rules of
$\Delta_C$ in such a way that almost all runs that use only the rules of
$\Delta_C$ eventually terminate (i.e., visit the configuration $\varepsilon$).
However, this contradicts the fact that $V$ contains only non-terminating runs.
Now we elaborate this outline into a formal proof.

Lemma 7.4. There is a set of runs
$V \subseteq \mathit{Run}(G_\Delta(\sigma, \pi), \alpha)$ such that
$\mathcal{P}_\alpha^{\sigma,\pi}(V) > 0$, and for every $w \in V$ we have that
$w$ does not visit $T_\varepsilon$ and all rules that are used infinitely often
in $w$ belong to $\hookrightarrow_C$.

Proof. Let $A$ be the set of all
$w \in \mathit{Run}(G_\Delta(\sigma, \pi), \alpha)$ such that $w$ does not
visit $T_\varepsilon$. By our assumption,
$\mathcal{P}_\alpha^{\sigma,\pi}(A) > 0$. The runs of $A$ can be split into
finitely many disjoint subsets according to the set of rules which are used
infinitely often. Since $\mathcal{P}_\alpha^{\sigma,\pi}(A) > 0$, at least one
of these subsets $V$ must have positive probability. Let $\hookrightarrow_V$ be
the associated set of rules that are used infinitely often in the runs of $V$.
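
The splitting step can be spelled out as a short calculation. The notation $A_R$ (the runs of $A$ whose set of infinitely often used rules is exactly $R$) is introduced here only for illustration, and each $A_R$ is assumed to be measurable:

```latex
A \;=\; \biguplus_{R \,\subseteq\, \hookrightarrow} A_R ,
\qquad
0 \;<\; \mathcal{P}_\alpha^{\sigma,\pi}(A)
  \;=\; \sum_{R \,\subseteq\, \hookrightarrow} \mathcal{P}_\alpha^{\sigma,\pi}(A_R)
\;\Longrightarrow\;
\mathcal{P}_\alpha^{\sigma,\pi}(A_R) > 0 \ \text{ for some } R .
```

Since the set of rules is finite, the sum has finitely many terms, so at least one term must be positive; any such $A_R$ can serve as the set $V$.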
We prove that $\hookrightarrow_V \subseteq \hookrightarrow_C$. Let
$L \subseteq \Gamma$ be the set of all symbols that appear on the left-hand
side of some rule in $\hookrightarrow_V$. To show that
$\hookrightarrow_V \subseteq \hookrightarrow_C$, it suffices to prove that

(a) for every $Y \in (L \setminus C) \cap (\Gamma_\bigcirc \cup \Gamma_\Box)$
we have that if $Y \hookrightarrow \beta$, then also
$Y \hookrightarrow_V \beta$;

(b) for all rules $Y \hookrightarrow_V \beta$ we have that
$\beta \in (L \cup C)^*$.

Observe that (a) and (b) together imply that $L \cup C$ is a terminal set.
Hence, $L \cup C = C$ by the maximality of $C$, and thus
$\hookrightarrow_V \subseteq \hookrightarrow_C$ as needed.

Claim (a) follows from the fact that player □, who plays according to
the strategy $\sigma$, selects edges uniformly at random in all configurations
of $((L \setminus C) \cap \Gamma_\Box) \cdot \Gamma^*$. Then every rule
$Y \hookrightarrow \beta$, where
$Y \in (L \setminus C) \cap (\Gamma_\bigcirc \cup \Gamma_\Box)$, has the
probability of being selected greater than some fixed non-zero constant,
which means that $Y \hookrightarrow_V \beta$ (otherwise, the probability of
$V$ would be zero).

Now we prove Claim (b). Assume that $Y \hookrightarrow_V \gamma$. If
$\gamma = \varepsilon$, then $\gamma \in (L \cup C)^*$. If $\gamma = P$, then
surely $P \in L$ because configurations with $P$ on the top of the stack occur
infinitely often in all runs of $V$. If $\gamma = PQ$, then $P \in L$ by
applying the previous argument. If $Q \in C$, we are done. Now assume that
$Q \notin C$. Note that then player □ selects edges uniformly at random in
all configurations of the form $\beta Q \delta$ where $|\beta| < 2|\Gamma|$.
By Lemma 7.3, there is $0 < \xi < 1$ such that for every configuration of the
form $PQ\delta$ there is a path $w$ from $PQ\delta$ to
$T \cup \{Q\delta\}$ in $G_\Delta(\sigma, \pi)$ satisfying the following:

• all configurations in $w$ are of the form $\beta Q \delta$ where
$|\beta| < 2|\Gamma|$;

• the probability of following $w$ in $G_\Delta(\sigma, \pi)$ is at least
$\xi$.

It follows that every run of $V$ enters configurations of
$\{Q\} \cdot \Gamma^*$ infinitely many times because every run of $V$ contains
infinitely many occurrences of configurations of the form $PQ\delta$ and no
run of $V$ enters $T$. Hence, $Q \in L$. □

Now we prove that $\mathcal{P}_\alpha^{\sigma,\pi}(V) = 0$ and obtain the
desired contradiction. By Lemma 7.4, all runs of $V$ eventually use only the
rules of $\hookrightarrow_C$. Each run $w \in V$ uniquely determines its
shortest prefix $v_w$ after which no rules of
$\hookrightarrow \setminus \hookrightarrow_C$ are used and the length of each
configuration visited after the prefix $v_w$ is at least as large as the
length of the last configuration visited by $v_w$. For a given finite path $v$
initiated in $\alpha$, let $U_v = \{w \in V \mid v_w = v\}$. Obviously,
$V = \bigcup_v U_v$. Since there are only countably many $v$'s, it suffices
to prove that $\mathcal{P}_\alpha^{\sigma,\pi}(U_v) = 0$ for every $v$. So,
let us fix a finite path $v$ initiated in $\alpha$, and let $Y\beta$ be the
last configuration visited by $v$. Intuitively, we show that after performing
the prefix $v$, the strategies $\sigma$ and $\pi$ can be "simulated" by
suitable strategies $\sigma'$ and $\pi'$ in the game $G_{\Delta_C}$ so that
the set of runs $U_v$ is "projected" (by ignoring the prefix $v$ and cutting
off $\beta$ from the bottom of the stack) onto the set of runs $U$ in the play
$G_{\Delta_C}(\sigma', \pi')$ so that
$\mathcal{P}_\alpha^{\sigma,\pi}(U_v) =
\mathcal{P}_\alpha^{\sigma,\pi}(\mathit{Run}(v)) \cdot
\mathcal{P}_Y^{\sigma',\pi'}(U)$.

Then, we show that $\mathcal{P}_Y^{\sigma',\pi'}(U) = 0$. This is because the
strategy $\sigma'$ is "sufficiently similar" to the strategy $\sigma_T$, and
hence the probability of visiting $\varepsilon$ in
$G_{\Delta_C}(\sigma', \pi')$ is 1.

Now we formalize the above intuition. First, let us realize that every
probability distribution $f$ on the outgoing edges of a BPA configuration
$\alpha$ determines a unique rule distribution $f_r$ on the rules of the
considered BPA game such that for every $\alpha \mapsto \alpha'$ we have that
$f(\alpha \mapsto \alpha') = f_r(Z \hookrightarrow \gamma)$, where
$Z \hookrightarrow \gamma$ is the rule used to derive the edge
$\alpha \mapsto \alpha'$.
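
The correspondence between edge distributions and rule distributions can be made concrete with a small sketch. The encoding is an assumption made for illustration: a configuration is a tuple of symbols with the top of the stack first, and `rules` maps each symbol to the right-hand sides of its rules, so that a rule with head `Z` and body `gamma` derives the edge from `Z rest` to `gamma rest`.

```python
from typing import Dict, List, Tuple

Config = Tuple[str, ...]            # stack content, top symbol first
Rules = Dict[str, List[Config]]     # head symbol -> possible right-hand sides


def rule_distribution(config: Config, rules: Rules,
                      edge_dist: Dict[Config, float]) -> Dict[Tuple[str, Config], float]:
    """Translate a distribution on outgoing edges into one on the deriving rules."""
    head, rest = config[0], config[1:]
    result = {}
    for body in rules[head]:
        successor = body + rest     # target of the edge derived by head -> body
        result[(head, body)] = edge_dist.get(successor, 0.0)
    return result
```

Because the rest of the stack is fixed, distinct right-hand sides give distinct successor configurations, which is why the rule distribution is uniquely determined.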

Observe that F G C by the definition of Let a' be a MR strategy for 
player □ in G^c such that for every 7 G we have that cr'{'~f) = a{'yP). 
Further, let $\pi'$ be a strategy for player O in $G_{\Delta_C}$ such that for
all $n \in \mathbb{N}$ and all $\alpha_1, \ldots, \alpha_n \in C^*$ we have
that the rule distribution of $\pi'(Y, \alpha_1, \ldots, \alpha_n)$ is the
same as the rule distribution of
$\pi(v, \alpha_1\beta, \ldots, \alpha_n\beta)$. Observe that every run
$w \in U_v$ determines a unique run $w_C \in \mathit{Run}(Y)$ in
$G_{\Delta_C}(\sigma', \pi')$ obtained from $w$ by first deleting the prefix
$v(0), \ldots, v(|v|-2)$ and then "cutting off" $\beta$ from all
configurations in the resulting run. Let $U = \{w_C \mid w \in U_v\}$. Now
it is easy to see that $\mathcal{P}_\alpha^{\sigma,\pi}(U_v) =
\mathcal{P}_\alpha^{\sigma,\pi}(\mathit{Run}(v)) \cdot
\mathcal{P}_Y^{\sigma',\pi'}(U)$. Note that all runs of $U$ avoid visiting
$\varepsilon$. However, we show that almost all runs of
$G_{\Delta_C}(\sigma', \pi')$ reach $\varepsilon$, which implies
$\mathcal{P}_Y^{\sigma',\pi'}(U) = 0$ and hence also
$\mathcal{P}_\alpha^{\sigma,\pi}(U_v) = 0$.

Observe that the strategy $\sigma'$ works as follows. There is a constant
$k < 2|\Gamma|$ such that in every $\gamma \in C^+$, where $|\gamma| < k$,
player □ selects edges uniformly at random. Otherwise, player □ selects the
same edges as if she was playing according to $\sigma_T$. We show that there
is $0 < \xi < 1$ such that for every $\gamma$, where $|\gamma| < k$, the
probability of reaching $\varepsilon$ from $\gamma$ in
$G_{\Delta_C}(\sigma', \pi')$ is at least $\xi$. Note that if player □ was
playing uniformly in all configurations, the existence of such a $\xi$ would
be guaranteed by Lemma 7.3. However, playing according to $\sigma_T$ in
configurations whose length exceeds $k$ can only increase the probability of
reaching $\varepsilon$. Now note that almost all runs of $\mathit{Run}(Y)$ in
$G_{\Delta_C}(\sigma', \pi')$ visit configurations of the form
$\gamma \in C^+$, where $|\gamma| < k$, infinitely often. From this we obtain
that almost all runs of $\mathit{Run}(Y)$ in $G_{\Delta_C}(\sigma', \pi')$
reach $\varepsilon$.
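
The last step rests on a standard geometric-decay argument, sketched informally below. The sketch makes a simplifying assumption: each visit to a short configuration is treated as a fresh attempt that reaches the empty stack with probability at least `xi`, regardless of the history. The function name and parameters are hypothetical.

```python
def miss_probability_bound(xi: float, visits: int) -> float:
    """Upper bound on the probability that `visits` successive attempts all fail."""
    if not 0.0 < xi <= 1.0:
        raise ValueError("xi must lie in (0, 1]")
    return (1.0 - xi) ** visits
```

The bound tends to zero as the number of visits grows, so runs that make infinitely many such visits without ever terminating form a set of probability zero.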
7.2. Proofs of Proposition 6.9 and Theorem 6.11

The procedure Main (see page [50]) starts by initializing $\mathcal{W}$ and
$\mathcal{U}$ to $\emptyset$, and the auxiliary BPA game $\Theta$ to $\Delta$
(the set of rules of $\Theta$ is denoted by

). In the main loop initiated at line 2 we first compute the greatest set
$W$ of witnesses in the current game $\Theta$. At line 3 we assign to
$\mathcal{M}$ the least fixed-point of the function
$\mathit{Att}_{\Theta,W}$. The BPA game $\Theta$ is then modified by
"cutting off" the set M. at lines HHHl Note that the resulting BPA game is 
again in SNF and it is strictly smaller than the original $\Theta$. Then, the
current sets $\mathcal{W}$ and $\mathcal{U}$ are enlarged at lines 10 and 11
and the new (strictly smaller) game $\Theta$ is processed in the same way.
This goes on until $\mathcal{W}$ and $\mathcal{U}$ stabilize, which obviously
requires only $O(|\Delta|)$ iterations of the main loop.
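
The least fixed-point computed at line 3 can be obtained by the usual iteration from the empty set. The sketch below is generic: `step` stands in for the attractor function of Section 6, whose precise definition is not reproduced here, and is only assumed to be monotone on subsets of a finite set.

```python
from typing import Callable, FrozenSet, TypeVar

T = TypeVar("T")


def least_fixed_point(step: Callable[[FrozenSet[T]], FrozenSet[T]]) -> FrozenSet[T]:
    """Iterate a monotone function from the empty set until nothing changes."""
    current: FrozenSet[T] = frozenset()
    while True:
        nxt = step(current)
        if nxt == current:
            return current
        current = nxt
```

For a monotone `step` over a finite universe the iterates grow strictly until they stabilize, so the loop ends after at most as many rounds as there are elements.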

Let $K$ be the number of iterations of the main loop. For every
$0 \le i \le K$, let $\Theta_i$, $\mathcal{W}_i$, and $\mathcal{U}_i$ be the
values of $\Theta$, $\mathcal{W}$, and $\mathcal{U}$ after executing exactly
$i$ iterations. Further, $W_i$ denotes the set of all witnesses in
$\Theta_i$, and $\mathcal{M}_i$ denotes the least fixed-point of
$\mathit{Att}_{\Theta_i,W_i}$. The symbols $\Sigma_i$ and $\Pi_i$ denote the
set of all strategies for player □ and player O in $G_{\Theta_i}$,
respectively. Finally,
Fj, , and denote the stack alphabet, the set of all rules, the set 
=2/, and the set of 0,, respectively. The edge relation of Gq^ is denoted 
by . Observe that 0o = A, >Vo = 0, = ^, ^0 = ^, Wk = 0, and 
WkjI^k is the result of the procedure Main. Let us note that in this section, 
the sets [T^]^^ and [T]^^ are always considered in the game A = 0o. 

We start by a simple observation which formalizes the relationship be- 
tween the symbols X and X in A = 0o. A proof is straightforward. 

Lemma 7.5. If X e [T,]<\ then X e [T]<\ 

Now we show that Wk = M) and Uk = ^o- For every < z < i^', 
let [T„i]<i = U*Wif* and [T,i]<^ = U*WiT* UU*. The "C" direction of 
Proposition 16.91 is implied by the following lemma: 

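Lemma 7.6. For every $0 \le i \le K$, there are SMD strategies
$\pi[\mathcal{W}_i], \pi[\mathcal{U}_i] \in \Pi_0$ constructible by a
polynomial-time algorithm with NP ∩ co-NP oracle such that

(1) For every $X \in \mathcal{W}_i$ and every $\sigma_0 \in \Sigma_0$ we have
that
$\mathcal{P}_X^{\sigma_0,\pi[\mathcal{W}_i]}(\mathit{Reach}(T_\varepsilon), G_{\Theta_0}) < 1$
or
$\mathcal{P}_X^{\sigma_0,\pi[\mathcal{W}_i]}(\mathit{Reach}([T_\varepsilon, i-1]^{<1}), G_{\Theta_0}) > 0$.

(2) For every $Y \in \mathcal{U}_i$ and every $\sigma_0 \in \Sigma_0$ we have
that
$\mathcal{P}_Y^{\sigma_0,\pi[\mathcal{U}_i]}(\mathit{Reach}(T), G_{\Theta_0}) < 1$
or
$\mathcal{P}_Y^{\sigma_0,\pi[\mathcal{U}_i]}(\mathit{Reach}([T, i-1]^{<1}), G_{\Theta_0}) > 0$.

(3) If $i > 0$, then $\pi[\mathcal{W}_i](X) = \pi[\mathcal{W}_{i-1}](X)$ for
every $X \in \mathcal{W}_{i-1}$ and
$\pi[\mathcal{U}_i](Y) = \pi[\mathcal{U}_{i-1}](Y)$ for every
$Y \in \mathcal{U}_{i-1}$.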

Proof. The strategies $\pi[\mathcal{W}_i], \pi[\mathcal{U}_i]$ are constructed
inductively on $i$. In the base case, $\pi[\mathcal{W}_0], \pi[\mathcal{U}_0]$
are chosen arbitrarily. Now assume that
$\pi[\mathcal{W}_i], \pi[\mathcal{U}_i] \in \Pi_0$ have already been
constructed. Due to Lemma 6.4, there is a SMD strategy $\pi_i \in \Pi_i$
constructible by a deterministic polynomial-time algorithm with NP ∩ co-NP
oracle such that for every $Z \in \mathcal{M}_i$ and every
$\sigma_i \in \Sigma_i$ we have that
$\mathcal{P}_Z^{\sigma_i,\pi_i}(\mathit{Reach}(T_\varepsilon), G_{\Theta_i}) < 1$.
(Strictly speaking, Lemma 6.4 guarantees the existence of a SMD strategy
$\pi_i \in \Pi_i$ such that the above condition is satisfied just for all
$Z \in W_i$. However, the strategy $\pi_i$ of Lemma 6.4 can be
easily modified so that it works for all Z G M.i = Uj^o ^^^e^ h/^C^)' whenever 
a new symbol A G To appears in Att:^'^^jy^,(0), we fix one of the rules B 
which witness the membership of A to Att-g^^^, (0).) The strategies 7r[VVi+i] 
and 7r[Wi+i] are defined as follows: 

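• for every $X \in \mathcal{W}_i$, we put
$\pi[\mathcal{W}_{i+1}](X) = \pi[\mathcal{W}_i](X)$;

• for every $X \in \mathcal{W}_{i+1} \setminus \mathcal{W}_i = \mathcal{M}_i$,
we put $\pi[\mathcal{W}_{i+1}](X) = \pi_i(X)$;

• for every $Y \in \mathcal{U}_i$, we put
$\pi[\mathcal{U}_{i+1}](Y) = \pi[\mathcal{U}_i](Y)$;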

• for every Y G Wj+i the distribution 7r[Wj+i](F) selects the (unique) 
rule Y such that y-^jQ is the rule selected by TiiiY). 

Observe that for every $0 \le i \le K$, the strategies
$\pi[\mathcal{W}_i], \pi[\mathcal{U}_i]$ are constructible by a deterministic
polynomial-time algorithm with NP ∩ co-NP oracle.

Now we show that Conditions (1)-(3) are satisfied for every $0 \le i \le K$.
We proceed by induction on $i$. The base case ($i = 0$) is immediate, because
$\mathcal{W}_0 = \mathcal{U}_0 = \emptyset$. Now let us assume that
$\pi[\mathcal{W}_i], \pi[\mathcal{U}_i]$ satisfy Conditions (1)-(3). The
strategies $\pi[\mathcal{W}_{i+1}], \pi[\mathcal{U}_{i+1}]$ obviously satisfy
Condition (3). By induction hypothesis, Condition (1) and Condition (2) are
satisfied for all elements of $\mathcal{W}_i$ and $\mathcal{U}_i$,
respectively. We verify that Condition (1) and Condition (2) are satisfied
also for the remaining symbols of
$\mathcal{W}_{i+1} \setminus \mathcal{W}_i$ and
$\mathcal{U}_{i+1} \setminus \mathcal{U}_i$, respectively.

Condition (1). Let us fix some X G W^+i \ Wi and o"o ^ ^o- If 
V'^'^^^''^^\Reach{\T^,i\^^ ,Gq^^) > 0, we are done. Now assume that 
V'^°'^^^'+'\Reach{[T„i]<\Ge,-,)) = 0. We show that the strategy ao can 
be "mimicked" by a strategy $\sigma_i \in \Sigma_i$ so that

V''j°'''^'^'+'\Reach{T,),eo) = Vf'''{Reach{T,),Qi) > (5) 

We construct the strategy ai so that the reachable parts of the plays 
Geo('^Oi 7r[VVi+i]) and G'e,(o"i, vtj) initiated in X become isomorphic. Let 
/ : f * ^ f * be a function defined inductively as follows: 

• /(£) = £; 

. ify/3e[T„^]<i, then /(F/3) = /(/?); 

• if /3 ^ [r„ then ? e f ^ (because Y ^ Wi) and we put f{Yj3) = 

• if F/3 ^ [Te,i]z^ and /3 G [T^Jj^S then Y eVi and we put f{Y(5) = 
Yf{f3) (observe that if F ^ f„ then Y e Wi and Y e Ui, which 
contradicts the assumption that Y(3 ^ [T£,i]^^). 

For every reachable state ao,...,aj of Geg{ao,n[yVi+i]) we put 
J-'{aQ, . . . , Oj) = f{ao), . . . , /(ttj) where / is the function defined above. Our 
aim is to setup the strategy so that T becomes an isomorphism. This 
means to ensure that for every reachable state ao, . . . , of Ggo(c'"0) "''"[^i+i]) 
we have that f{ao),...,f{aj) is a reachable state of GQ^{ai,Tri), and 

ao, . . . , aj-i ^ tto, • • • , implies /(ao), • • • , /(aj-i) ^ /("o), • • • , /("i)- 
We proceed by induction on j and define the strategy on the fly so that 
the above condition is satisfied. The base case (when j = 0) is immedi- 
ate, because J'{X) = f{X) = X and the root X has no incoming transi- 
tions. Now assume that ao, . . . , a^ is a reachable state of GQg{ao, 7r[>Vj+i]) 
such that ao, . . . , a^-i ao, . . . , aj. Then aj-i i-^o 0(j is an edge in Goo; 
which is assigned the probability x either by Prob, ttq, or ao, depending 
on whether the first symbol of aj-i belongs to Tq, To, or T^, respec- 
tively. By induction hypothesis, /(ao), . . . , /(aj_i) is a reachable state of 
G'Gj((Ji, TTj) and hence it suffices to show that /(a^-i) h- )-i /(a^) is an edge 
in Ge- which is assigned the same probability x by Prob, TTj, or the newly 
constructed (ii, respectively. Let aj-i = Afi. Note that since A(3 ^ [T^.i]^^, 
we have that f{aj_i) = /(A/3) = Af{l3), where i = A or i = A de- 
pending on whether /3 G [T^,-?]^^ or not, respectively If A G Tq, then 
aj = Bj3 for some B such that A^qB. But then also A^iB, where 
B is either B or B depending on whether A = A or A = A, respec-
tively. Hence, = Af{l3)^iBf{l3) = f{Bl3) = f{aj) as needed. 
If y4 G To, we argue in a similar way, using the definitions of 7r[Wi+i] and 
TTj. The most complicated case is when A G Fq. It suffices to show that 
= Af{f3) f{cij). The distribution (Tj(/(ao), . . . , can then 
safely select the edge /(Q;j_i) i— )-j f{aj) with probability x. According to Def- 
inition 16. 8t we can distinguish the following three possibilities: 

• A e f[l]. Then aj = B(3 for some B such that A^qB. If A = A, 
then A = A, B = B, and a ^ [T^, z]^^ by the definition of /. Further, 
B ETi because otherwise B G Wj and B/S G [T^, i]^^, which contradicts 

the assumption P^°''^'^''*'^'(i?eac/i([T£, i]^-'^), 60) = 0. Hence, = 
Af{f3) h^iBf{f3) = f{aj) as needed. 

If A = A, then either A = A or A = A, and we consider these two cases 
separately. If A = A, then B = B and a G [Te,z]^^ by the definition 
of /. Further, i? G Fj because otherwise B EUi and thus Bf3 G [T^, i]^^, 
which contradicts the assumption P^''''^'^*^^^(i?eac/i([Te, i]^^), Bq) = 0. 
Hence^/(a,_i) = = f{BI3) = f{aj). If ^ = I, then 

B = B and i? G Fj, because otherwise -B G Wj and 5/3 G [T^,^]^^, 
which contradicts the assumption V'^'^^^^'^^\Reach{[Ts,i]^^),Qo) = 0. 
Hence, = = /(«,). 

• AG f[2]. Then aj = (5. \f A = A, then /3 ^ [Te,«]o^ 
and = h^, = /(a,). If i = I, then (3 G 
[Te,?]^^ by the definition of / which contradicts the assumption 

• A G f [3]. If i = A, then A = A and = BC(3 where 
A^qBC is the only available rule with A on the left-hand side. Fur- 
ther, we either have A-^iBC or A-^iB. In the first case we ob- 
tain B,C ^ Wi, which means that f{BCf3) = BCf{f3) and hence 
fiaj-i) = Af{/3) ^,BCf{l3) = f{aj). In the latter case, C^G Wi and 
B ^ W„ which means that B ^ Ui. Hence, f{BCf3) = Bf{f3) and 
/(a,_i)=^/(/3)^^5/(/3) = /(«,). 

If A = A, then either A = A 01 A = A. If A = A, then f3 G [Ts,i]o^ 
and aj = BCP where A^qBC is the only available rule with A on 
the left-hand side. Further, we either have A^iB or A^iBC. In
the first case, we have that C G Wj, hence C and C/3 G [Te,^]^^. 
This means /(5C/3) = and hence /(a^-i) = = 

f{oLj)- In the latter case, B,C ^ Wj, which means that C Wj and 
hence C ^Ui. Now realize that for every P G F we have that if P G Wj, 
then P &Ui. This follows directly from the "main" induction hypothesis 
(which states that Conditions (1) and (2) hold for the symbols of Wj 
and Wj, respectively) and Lemma |7.5[ From this and C ^ Ui we can 
conclude that C ^ Wj. This implies C/3 ^ [T^,?]^^, which means that 
f{BCf3) = BCf{f3) and hence = ^^BCf{f3) = f{a,). 

Condition (2). We proceed similarly as in the case of Condition (1). Let 
Y G \ Ui and (Tq G Sq. If P|''"'"'+^^(i?eac/i([T, z]<i, GqJ) > 0, we are 

done. Now assume that P|''^f"'+''(i?eac/i([T, i]<\ GqJ) = 0. We show that 
the strategy ctq can be "mimicked" by a strategy a.i G Sj so that 

P^"'"t"'+^^(i?eac/i(T),eo) = Vp'^^{Reach{T),ei) > (6) 

We construct the strategy (Tj so that the reachable parts of the plays 
Geo(co; 7r[Wi+i]) and Ge.{^i^'^i) initiated in Y and Y become isomorphic. 
Let / : F* — 7- F* be a function defined in the same way as / except that 
[T, i]^^ is used instead of [T^,i]^^. For every reachable state aQ,...,aj of 
G'eo('^05 7r[Wj+i]) we put ^{ao, ■ ■ ■ ,aj) = /(ao), . . . , /(aj) and define the 
strategy o"j so that the function J-" becomes an isomorphism. The rest of 
the proof is almost the same as for Condition (1). □ 

Note that an immediate consequence of Lemma 7.6 is the following:

Lemma 7.7. For every $0 \le i \le K$ we have that
$\mathcal{W}_i \subseteq [T_\varepsilon]^{<1}$ and
$\mathcal{U}_i \subseteq [T]^{<1}$.

Lemma 7.7 is proven by a trivial induction on $i$, using Lemma 7.6 in the
induction step. Thus, the "$\subseteq$" direction of Proposition 6.9 is
established. The opposite direction is shown in the next lemma.

Lemma 7.8. We have that Wr H = and Uk^% = 0- 

Proof. Since Wk = 0, due to Proposition 17.21 there is a regular MR strategy 
(Ja' G Sa' which is (T^, =l)-winning in every a G T*^^. Moreover, the strategy 
(Jk is computable in time which is polynomial in the size of Ga- (assuming 
that Qk has already been computed). Let = F \ Wk = and =
T \ Uk- We show that the strategy ax can be efficiently transformed into 
regular MR strategies cro,(3"o G Eq such that o"o is (T^, =l)-winning in every 
configuration of and ctq is (T, =l)-winning in every configuration of 

^^^xf- In particular, this means that ^ [Ts]=^ and ^i^- C [T]=^ 
hence Wi^ H ^/q = and Uk r]% = (/} as needed. 

First we show how to construct the strategy a^. We start by defining 
(partial) functions g,h : T* inductively as follows: 

• gi^) = h{^) = £ 

{Yg{/3) HY e^K and /3 e ^rT* U {e}; 
Yh{l3) if F, F e and /3 G (F \ ^k)T*; 
± otherwise. 



HYP) 



g{Yf3) ifYe^K, 
h{(3) otherwise. 



A configuration a G F* is called g-eligible if g{a) 7^ _L. The strategy (Tq is 
constructed so that for every ^f-eligible Aa G f qF *, the following conditions 
are satisfied: 

• If A G r[l] and aK{g{Aa)) selects a rule A^kB with probability x, 
then ao{Aa) selects the rule A^qB with probability x. 

• If A G r[2] U r[3], then ao{Aa) selects the only available rule with 
probability 1. 

Note that the definition of ctq is effective in the sense that if the finite-state 
automaton s^^j^ associated with the regular MR strategy cxx (see Defini- 
tion 14. 3p has already been computed, then the finite-state automaton ^2^0 
associated with ctq simply "simulates" the execution of s^cr^ on the reverse of 
g{a) for every (yf-eligible a G F*. Hence, the automaton .a^o is constructible 
in polynomial time assuming that the BPA game has already been com- 
puted (cf. Proposition I7.2p . 

We show that for every (7-eligible initial configuration 7 G F* and every 
TTo G Ho we have that V!^°''^°{Reach{Ts),Qo) = 1. Assume the converse, 
i.e., there is a strategy ttq G FIq and a gf-eligible configuration 7 such that 
$\mathcal{P}_\gamma^{\sigma_0,\pi_0}(\mathit{Reach}(T_\varepsilon), \Theta_0) < 1$. We show that then there is a strategy $\pi_K \in \Pi_K$
such that 

P;»'-°(^eac/i(T,),eo) = V™ {Reach{T,),eK) = 1 (7) 

which is a contradiction. For every finite sequence of (jf-ehgible configura- 
tions ao, . . . ,an G f* such that = A(3 G fof*, the strategy tik selects 
a rule A'^k B in g{ao), . . . ,g{an) with probability x iff the strategy ttq se- 
lects a rule B in ckq, . . . , with probability x. We show that every 
reachable state a^, . . . , of the play G0„(o'o, ttq) initiated in 7 is a sequence 
of ^f-eligible configurations and the function Q over the reachable states of 
G^eo(o'O) ttq) defined by Q{aQ, . . . , aj) = g{ao), ■ ■ ■ , g{o.j) is an isomorphism 
between the reachable parts of the plays G0o(cro,7ro) and Ge^^ (cxi^, tt/^:) initi- 
ated in 7 and ^'(7), respectively. We proceed by induction on j. The base case 
is immediate. Now assume that ao, . . . , ccj is a reachable state of Gqq{o'q, ttq) 
such that «oi • • • i "^i-i <^0! • • • i <^i- Then ^-)■o is an edge in Geo; 
which is assigned the probability x either by Proh, ttq, or cxo, depending on 
whether the first symbol of cxj-i belongs to f q, f o, or f □, respectively. By 
induction hypothesis, ao, ■ • • , otj-i is a sequence of g'-eligible states and hence 
it suffices to show that g{aj_i) g{(^j) is an edge in Gq^ which is assigned 
the same probability x by Prob, ttk, or ax, respectively. Let cKj-i = A/3. We 
distinguish three possibilities: 

• A e To U To U r[l]. Then aj_i = A(3^^oB/3 = aj where A^qB. If 
g{Af3) = Ag{(3), then B G U {e} and since A-^k B, we have that
g{A/3) = Ag{(3) Bg{(5) =j{Bf5). If g{A/3) = Ah{/3), then A e 
and /3 e (r \ Mk)T*, hence A-^kB and g{Ap) = Ah{P) Bh{P) = 
g{BI3). It follows immediately from the definition of ao and ttk that the 
edges ttj-i i->-o Oij and g{aj^i) gioij) are assigned the same proba- 
bility. 

• A e f[2]. Then A ^ A and aj-i ^ A^h^o/3 = aj. Further, observe 
that A ^ Tk: because otherwise A'^kA is the only rule with A on 
the left-hand side, hence A G Wk and we have a contradiction. So, 

g{AI3) = Ag{P)v-^Kg{P)- Obviously, aj^i^^aj axid g{aj^i)^K gi^j) 
are assigned probability 1. 

• A G f [3]. Then = A/3 EC (3 = where A^q BC. If ^(A^) = 
Ag{j3), then /3 G 3§k^* U {e} and there are two possibilities:
— A'^K BC. By the definition of g, we have that g{BCf3) =
BCg{/3), hence g{A/3) = Ag{/3) ^KBCg{/3) = g{BC/3) as needed. 

- A^kB. Then A,A e and C ^ ^k, hence g{BCP) = 
Bh{C/3) =^ Bg{f3) by the definition of g. Thus, = 

^KBg{P) = g{BCP). 

If c/(A^) = then A e and ^ e (f \ ^x)f*. Again, there 

are two possibilities:

— A^kBC. By the definition of g, we have that g{BC(3) — 
BCh{p), hence g{Ap) = ^KBCh{p) = g{BCP). 

- A^kB. Then e and C ^ ^k, hence g{BCp) = 
S/i(C/3) =^ by the definition of g. Thus, c/(i^) = 
Ah{P) ^KBh{P)^g{BC(3). 

In all of the above discussed subcases, we have that the edges 
aj^i^oaj and g{aj_i)^K gioij) are assigned probability 1. 

Since every configuration of is ^f-eligible, the strategy cxo is 

(Tg, =l)-winning in every configuration of 

The definition of o"o and the proof that (Jq is (T, =l)-winning in every 
configuration of ^^Di^F* are very similar as in the case of (Tq- The main 
(and only) difference is the definition of the function g. Instead of g and h, 
we use partial functions g,h : T* ^ T*j^ defined as follows: 

• gi^) = K^) = £ 

{Yg{^) HY e^K and p e ^^^^f *; 
Yh{p) if y, y e and /3 i ^^^xf *; 
± otherwise. 

otherwise. 

The strategy (Tq and the function Q are defined in the same way as uq and 
using g instead of g. The strategy hk is defined in the same way as 
above. Observe that V'^^^'-^^'^ {Reach(T^),QK) = 1 and since (7(7) contains at 
least one symbol of f , we have that V'^^^i^'^ {Reach{s), Qk) = which means 
that V'^^^'^'^ {Reach{T),Qx) = 1- The case analysis which reveals that Q is
an isomorphism between the reachable parts of the plays GeiXfTo, ttq) and 
Gqj^{(Tk-,t^k) initiated in 7 and ^(7), respectively, is almost the same as 
above. □ 

Lemma 7.7 and Lemma 7.8 together imply Proposition 6.9. It remains to
prove Theorem 16. Ill The strategy (Jq constructed in the proof of Lemma [7[8] 
is (T, =l)-winning in every configuration of ^^^^^F*. Since = and 
QIk = ^ by Proposition 16. 9| the strategy is (T, =l)-winning in every 
configuration of $[T]^{=1}$. As it was noted in the proof of Lemma 7.8, the
strategy is constructible in polynomial time assuming that the BPA game
$\Theta_K$ has already been computed. Since $\Theta_K$ is computable by a
deterministic polynomial-time algorithm with NP ∩ co-NP oracle, the first part
of Theorem 6.11 is proven. It remains to show that there is a regular strategy
$\pi \in \Pi$ constructible by a deterministic polynomial-time algorithm with
NP ∩ co-NP oracle such that $\pi$ is $(T,{<}1)$-winning in every configuration
of
[T]<i = <^Wf * U For all X G and F G ^, let /^(X) and /^(F) be 
the least i and j such that X G ^ and F G "^j, respectively (note that this 
definition makes sense because ^ = and = by Proposition I6.9p . 
Further, for all a G F* and X G F we define 



max{0, Ls-{a{i)) | < z < |a|} if a G ^* 
00 otherwise. 

max{]:>hce<r^. (a), /^(X)} if a G ^* and X G ^ 
00 otherwise. 



• pricecg,^{a) = min{z;a/ue<:^.^(/3) | a = /3'y} 

• price{'~f) = min{pricec^t^{'-f), price,^*{'~f)} 

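Let $\sqsubset$ be a strict (i.e., irreflexive) ordering over $\mathcal{C}^*\mathcal{D}\Gamma^* \cup \mathcal{C}^*$ defined as
follows: $\alpha \sqsubset \beta$ if either $\mathit{price}(\alpha) < \mathit{price}(\beta)$, or $\mathit{price}(\alpha) = \mathit{price}(\beta)$ and
$\mathit{price}_{\mathcal{C}^*}(\gamma_1) < \mathit{price}_{\mathcal{C}^*}(\gamma_2)$, where $\alpha = \gamma_1\eta$, $\beta = \gamma_2\eta$, and $\eta$ is the longest
common suffix of $\alpha$ and $\beta$. One can easily verify that the ordering $\sqsubset$ is
well-founded. Let $\pi[\mathcal{W}_K]$, $\pi[\mathcal{U}_K]$ be the SMD strategies of Lemma 7.6. The
strategy $\pi$ is defined so that the following conditions are satisfied: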

• if $\mathit{price}(Z\alpha) = \infty$, then $\pi(Z\alpha)$ is defined arbitrarily;
• if $Z \in \mathcal{D}$ and $I_{\mathcal{D}}(Z) \le \mathit{price}(Z\alpha)$, then $\pi(Z\alpha) = \pi[\mathcal{W}_K](Z\alpha)$;

• otherwise, $\pi(Z\alpha) = \pi[\mathcal{U}_K](Z\alpha)$.

Observe that $\pi$ is regular, and the associated finite-state automaton is
constructible in time polynomial in the size of $\Delta$ if the strategies $\pi[\mathcal{W}_K]$, $\pi[\mathcal{U}_K]$, the sets
$\mathcal{C}$, $\mathcal{D}$, and the functions $I_{\mathcal{C}}$, $I_{\mathcal{D}}$ have already been computed. Since all of
these objects are computable by a deterministic polynomial-time algorithm
with an NP ∩ co-NP oracle, the automaton is also constructible by a deterministic
polynomial-time algorithm with an NP ∩ co-NP oracle. It remains
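to show that the definition of $\pi$ is correct, i.e., for every $\gamma \in \mathcal{C}^*\mathcal{D}\Gamma^* \cup \mathcal{C}^*$
and every $\sigma \in \Sigma$ we have that $\mathcal{P}_{\gamma}^{\sigma,\pi}(\mathit{Reach}(T), \mathcal{G}_\Delta) < 1$. We proceed by
induction with respect to the well-founded ordering $\sqsubset$. The only minimal
element of $\mathcal{C}^*\mathcal{D}\Gamma^* \cup \mathcal{C}^*$ is $\varepsilon$, where we have $\mathcal{P}_{\varepsilon}^{\sigma,\pi}(\mathit{Reach}(T), \mathcal{G}_\Delta) = 0$. Now
let $Z\alpha \in \mathcal{C}^*\mathcal{D}\Gamma^* \cup \mathcal{C}^*$ be some non-minimal element. By Lemma 7.6 and
the definition of $\pi$ we immediately have that either $\mathcal{P}_{Z\alpha}^{\sigma,\pi}(\mathit{Reach}(T), \mathcal{G}_\Delta) < 1$
or $\mathcal{P}_{Z\alpha}^{\sigma,\pi}(\mathit{Reach}(\gamma\alpha), \mathcal{G}_\Delta) > 0$, where $\gamma\alpha \sqsubset Z\alpha$. In the first case, we are done
immediately, and in the second case we apply the induction hypothesis.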
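
The comparison that drives this induction can be sketched as follows. This is a minimal sketch under stated assumptions: configurations are strings with the top of the stack first, and the two price functions are passed in as callables (`price` for the overall price, `price_c` for the price used in the tie-break). The function names and the digit-based toy prices in the usage note are hypothetical.

```python
def strip_common_suffix(alpha, beta):
    """Remove the longest common suffix of alpha and beta and return
    the two remaining prefixes."""
    k = 0
    while k < len(alpha) and k < len(beta) and alpha[-1 - k] == beta[-1 - k]:
        k += 1
    return alpha[:len(alpha) - k], beta[:len(beta) - k]


def precedes(alpha, beta, price, price_c):
    """Strict ordering: alpha comes before beta if its price is smaller,
    or if the prices are equal and, after removing the longest common
    suffix, the remaining part of alpha has the smaller price_c."""
    pa, pb = price(alpha), price(beta)
    if pa != pb:
        return pa < pb
    gamma1, gamma2 = strip_common_suffix(alpha, beta)
    return price_c(gamma1) < price_c(gamma2)
```

For a quick check one can take a toy price that reads each symbol as a digit and returns the largest one (0 for the empty string). Under that toy price the empty configuration precedes every non-empty one, and no configuration precedes itself, which matches the irreflexivity required of the ordering.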



8. Conclusions 

We have solved the qualitative reachability problem for stochastic BPA
games, retaining the same upper complexity bounds that have previously
been established for termination [3]. One interesting question which
remains unsolved is the decidability of the problem whether val(α) = 1 for
a given BPA configuration α (we can only decide whether player □ has a
(=1)-winning strategy, which is sufficient but not necessary for val(α) = 1).
Another open problem is quantitative reachability for stochastic BPA games, 
where the methods presented in this paper seem insufficient. 